Nearly 400,000 people subscribe to the YouTube account Rob the Robot – Learning Videos For Children. In one 2020 video, the animated humanoid and his friends visit a stadium-themed planet and attempt feats inspired by Heracles. Their adventures are suitable for the elementary school set, but young readers who switch on YouTube’s automated captions might expand their vocabulary. At one point YouTube’s algorithms mishear the word “brave” and caption a character aspiring to be “strong and rape like Heracles.”
A new study of YouTube’s algorithmic captions on videos aimed at kids documents how the text sometimes veers into very adult language. In a sample of more than 7,000 videos from 24 top-ranked kids’ channels, 40 percent displayed words in their captions found on a list of 1,300 “taboo” terms, drawn in part from a study on cursing. In about 1 percent of videos, the captions included words from a list of 16 “highly inappropriate” terms, with YouTube’s algorithms most likely to add the words “bitch,” “bastard,” or “penis.”
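The counting described above can be approximated in a few lines. What follows is a minimal sketch, not the researchers' actual pipeline: it assumes captions are already available as plain text, and the term list, function names, and sample captions are all hypothetical stand-ins (the study's list ran to about 1,300 entries).

```python
import re

# Hypothetical stand-in for the study's term list (assumption, not the real list).
TABOO_TERMS = {"bastard", "crap", "porn"}

def tokenize(caption: str) -> list[str]:
    """Lowercase a caption and split it into word tokens."""
    return re.findall(r"[a-z']+", caption.lower())

def flagged_terms(caption: str, terms: set[str] = TABOO_TERMS) -> set[str]:
    """Return the listed terms that appear as whole words in a caption."""
    return set(tokenize(caption)) & terms

def share_flagged(captions: list[str], terms: set[str] = TABOO_TERMS) -> float:
    """Fraction of captions containing at least one listed term."""
    if not captions:
        return 0.0
    hits = sum(1 for caption in captions if flagged_terms(caption, terms))
    return hits / len(captions)

# Toy captions, loosely based on examples quoted in this article.
sample = [
    "you should also buy porn",
    "you should also buy corn",
    "look at the crap on the beach",
    "strong and brave like Heracles",
]
print(share_flagged(sample))  # 0.5
```

Matching on whole words rather than substrings keeps a word like “scrappy” from being counted as a hit, though a real analysis would also have to decide how to treat plurals, misspellings, and multiword phrases.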
Some videos posted on Ryan’s World, a top kids’ channel with more than 30 million subscribers, illustrate the problem. In one, the phrase “You should also buy corn” is rendered in captions as “you should also buy porn.” In other videos, a “beach towel” is transcribed as a “bitch towel,” “buster” becomes “bastard,” a “crab” becomes a “crap,” and a craft video on making a monster-themed dollhouse features a “bed for penis.”
“It’s startling and disturbing,” says Ashique KhudaBukhsh, an assistant professor at Rochester Institute of Technology who researched the problem with collaborators Krithika Ramesh and Sumeet Kumar at the Indian School of Business in Hyderabad.
Automated captions are not available on YouTube Kids, the version of the service aimed at children. But many families use the standard version of YouTube, where the captions can be seen. Pew Research Center reported in 2020 that 80 percent of parents of children 11 or younger said their child watched YouTube content; more than 50 percent of children did so daily.
KhudaBukhsh hopes the study will draw attention to a phenomenon that he says has gotten little notice from tech companies and researchers and that he dubs “inappropriate content hallucination”—when algorithms add unsuitable material not present in the original content. Think of it as the flip side to the common observation that autocomplete on smartphones often filters adult language to a ducking annoying degree.
YouTube spokesperson Jessica Gibby says the company recommends that children under 13 use YouTube Kids, where automated captions cannot be seen. On the standard version of YouTube, she says, the feature improves accessibility. “We are continually working to improve automatic captions and reduce errors,” she says. Alafair Hall, a spokesperson for Pocket.watch, a children’s entertainment studio that publishes Ryan’s World content, says in a statement that the company is “in close and immediate contact with our platform partners such as YouTube who work to update any incorrect video captions.” The operator of the Rob the Robot channel could not be reached for comment.
Inappropriate hallucinations are not unique to YouTube or video captions. One WIRED reporter found that a transcript of a phone call processed by startup Trint rendered Negar, a woman’s name of Persian origin, as a variant of the N-word, even though it sounds distinctly different to the human ear. Trint CEO Jeffrey Kofman says the service has a profanity filter that automatically redacts “a very small list of words.” The particular spelling that appeared in WIRED’s transcript was not on that list, Kofman says, but it will be added.