This story is part of a series on how we watch stuff—from the emotional tug of Facebook video series to the delight of Netflix randomness.
If you rely on YouTube’s captions, good luck. In a recent random sample, the common phrase “You’re on your own” was captioned “You won you’re wrong.” “Ethan has to leave” came out “ether nice to leave.” “Met” became “wet”—and “wedding,” somehow, “lady”—until finally the videos collapsed into unintelligibility. In this bizarre and silent version of YouTube, people don’t ask you to “subscribe and turn on notifications.” They ask you to “subscribe and turn on other patients.” It’s dark.
For people who are deaf or hard of hearing, making sense of videos online can be deeply frustrating, even if the video is captioned, which is now the norm (if not the law) on most platforms. YouTube’s captions are often garbled, because, unless YouTubers themselves intervene and manually type out the correct words, they’re auto-generated, the best efforts of a closed-captioning algorithm the company has been tweaking for years. Appreciative of the effort but unconvinced by the results, activists have dubbed them “craptions.”
One activist is Rikki Poynter, who runs a “No More CRAPtions” social media campaign. Poynter’s hearing had been fading gradually, but the loss accelerated after she graduated from high school in 2009. By 2010, when she was routinely uploading beauty tutorials to YouTube, participating online was challenging. “Even with earphones in, it was becoming a struggle to understand what was being said,” Poynter says. The auto-generated captions weren’t a huge help. “‘Zebra’ would be said instead of ‘concealer,’” she says. She began leaving messages on other beauty YouTubers’ channels, imploring them to add correct captions to their videos.
Even after celebrity YouTuber Tyler Oakley gave her a shout-out—the YouTube equivalent of being one of Oprah’s favorite things—Poynter got little sustained response from her online community. “It hurts to be ignored,” she says. YouTubers would promise to prioritize captioning and then fall off after a few videos, letting the algorithm resume the work. “‘Craptions’ isn’t a new term by any means,” Poynter says. “The deaf community has been coming together on this for a long time now—though it seems I’ve had the most success, and I have been one to still constantly push on it.”
Caption experts are quick to point out that even those with perfect hearing can benefit from using captions. They assist English language learners, of course, but also native speakers struggling to understand, say, a thick Scottish brogue. And when you’re scrolling through your feed with the sound off, they’re a necessity. James Rath, a YouTuber and filmmaker, says captions can expand a video’s reach and performance, since search engines may pull keywords from the transcript.
For businesses, failing to provide adequate captions can result in a lawsuit, which has proved troublesome for streaming services like Netflix, along with major broadcasters like CNN. Individual social media content creators, by contrast, are unlikely to be found in violation of the Americans with Disabilities Act. Nor do the platforms themselves enforce stringent requirements. As a general rule, the deaf community tends to see Facebook’s system (and auto-captions) as pretty good, while Twitter and Instagram can prove kludgy and awkward. YouTube falls somewhere in the middle but is unquestionably the internet’s video behemoth. A boycott is simply impractical, and even the auto-captioning system’s sharpest detractors, like Poynter, admit it’s gotten better. “The inaccuracies used to be way worse,” she says. YouTube wouldn’t offer specifics but acknowledged that the algorithm still needs improvement, which is why the company encourages creators to edit the captions or add their own.
Manual captioners likely wouldn’t make mistakes on par with zebra/concealer, but they’re not infallible either. YouTube gives you captioning options: If you don’t want to use the auto-generated ones, you can upload your own, or allow your viewers to write and upload their versions. The audience-generated captions can be great, depending on the community—often, they are foreign-language translations. They can also be confusing. “A lot of us find that community contributions are not legible,” Poynter says. “The worst offender has been actual paragraphs written as captions. I’m talking a caption block that takes up half the video screen! You actually can’t see what you’re supposed to be seeing because it’s covered by words.” Deaf YouTuber Jessica Kellgren-Fozard has videos dedicated to explaining the etiquette around captioning. Another faux pas: using the captions as a place to add jokey commentary. “Jokes in the captions drive me up the damn wall,” reads the video’s top comment. “Like, I didn’t come to this Youtuber’s video to be subjected to a random captioner’s personal stand-up night.”
The responsible thing for most people to do, in Poynter’s estimation, is to pay for professional captioning. Captioning services follow rigorous style guides to ensure consistency and clarity. Rev, a go-to service for many YouTubers, is a gig-economy special: 40,000 people working from home captioning videos for the price of $1 per minute. Standards are clear. If there’s text on-screen, captions should appear at the top and not the bottom. Song lyrics are framed by musical notes; sounds like gunfire or a door slamming should be bracketed. According to Jason Chicola, Rev’s founder and CEO, this is plenty good enough for YouTube but not for, say, Netflix, which requires perfect timing, down to the frame, every word popping up in perfect synchronicity with an actor’s speech.
But even Netflix isn’t perfect. In captions, nobody has an accent—which isn’t a big deal unless it’s important to their character. Captions often reduce dialog and sound effects to what Sean Zdenek, author of Reading Sounds: Closed-Captioned Media and Popular Culture, calls “a single sonic plane.” You might hear a dog barking in the distance while someone speaks, but, because of space constraints, captions might not indicate relative volume. Captions also linearize sounds taking place at the same time because people have to read words one after another. Deaf and hard-of-hearing people are sometimes experiencing a movie or video out of sync with hearing audiences. “My son was born deaf,” Zdenek says. “When we watch movies together, he would laugh at jokes before they were uttered by the actors.”
Captions are an art form, requiring the distillation of an entire landscape of sound—music, speech, background noise—into tweet-sized, speed-readable lines. “You see a narrative boiled down to just a few key sounds,” Zdenek says. “I’d love to see producers and directors work more closely with captioners. Even on a movie that costs many millions of dollars, captions are usually a couple-thousand-dollar rush order done in 24 to 48 hours.”
Zdenek and Poynter hope their work will not just encourage wide-scale caption adoption but also move the medium forward. Zdenek sees unused potential in the text of captions themselves. “In the UK, they use a different color for each speaker,” he says. “I’ve also explored using effects, like a smoky effect to the lyrics of a scary lullaby chanted by ghostly children.” Poynter hopes online content captions will catch up with the conventions of movie and TV captions—she wants the music, the doors closing, the birds singing.
Right now, she’d settle for not being harassed. It wasn’t the misheard words or the unreadable community-generated captions that transformed Poynter from beauty guru to caption crusader. “What really pushed me over the edge was when people would abuse the community contribution service on YouTube and use it to troll,” she says. Poynter has found captions overrun with commentary that ranges from nasty to inane: interactions between two men labeled “gay porn”; stories about car crashes punctuated with remarks like “Shitty drunkards amirite?”; people using celebrity YouTubers’ captions as a place to promote their own channels; entire captions being replaced by the word “meow,” over and over again.
She and others have noticed troll captions most often on huge channels belonging to gamers like Markiplier, J