These Deepfake Voices Can Help Trans Gamers

Fred, a trans man, clicked his mouse, and his tenor tones suddenly sank deeper. He’d switched on voice-changing algorithms that provided what sounded like an instant vocal cord transplant. “This one is ‘Seth,’” he said, of a persona he was testing on a Zoom call with a reporter. Then, he switched to speak as “Joe,” whose voice was more nasal and upbeat.

Fred’s friend Jane, a trans woman also testing the prototype software, chuckled and showcased some artificial voices she liked for their feminine sound. “This one is ‘Courtney’”—bright and upbeat. “Here’s ‘Maya’”—higher pitched, sometimes by too much. “This is ‘Alicia,’ the one I find has the most vocal variance,” she concluded more mellowly. The glitches were slight enough to prompt the fleeting thought that the pair might not have joined the call with their “real” voices to begin with.

Fred and Jane are early testers of technology from startup Modulate that could add new fun, protections, and complications to online socializing. WIRED is not using their real names to protect their privacy; trans people are often targeted by online harassment. The software is the latest example of the tricky potential of artificial intelligence technology that can synthesize real-seeming video or audio, sometimes termed deepfakes.

Modulate’s cofounders Mike Pappas and Carter Huffman initially thought the technology they term “voice skins” could make gaming more fun by letting players take on characters’ voices. As the pair pitched studios and recruited early testers, they also heard a chorus of interest in using voice skins as a privacy shield. More than 100 people asked if the technology could ease the dysphoria caused by a mismatch between their voice and gender identities.

“We realized many people don’t feel they can participate in online communities because their voice puts them at greater risk,” Pappas, Modulate’s CEO, says. The company is now working with game companies to provide voice skins in ways that offer both fun and privacy options, while also pledging to prevent them from becoming a tool of fraud or harassment themselves.

Games such as Fortnite and social apps like Discord have made it common to join voice chats with strangers on the internet. As with the early days of online text chat, the voice boom has unlocked both new delights and horrors.

The Anti-Defamation League found last year that almost half of gamers had experienced harassment via voice chat while playing, more than via text. A sexist streak in gaming culture causes women and LGBTQ people to be singled out for special abuse. When Riot Games launched team-based shooter Valorant in 2020, executive producer Anna Donlon said she was stunned to see a culture of sexist harassment quickly spring up. “I do not use voice chat if I’m going in alone,” she told WIRED.

Modulate’s technology is not yet widely available, but Pappas says he is in talks with game companies interested in deploying it. One possible approach is to create modes within a game or community where everyone is assigned a voice skin to match their character, whether a gruff troll or knight in armor; alternatively, voices could be assigned randomly.

In June two of Modulate’s voices launched inside a preview of an app called Animaze, which transforms a user into a digital avatar in livestreams or video calls. The developer, Holotech Studios, markets the voices as both a privacy feature and a way to “morph your voice to better fit a character with different age, gender, or body type than your own.” Modulate also offers game companies software that automatically notifies moderators of signs of abuse in voice chats.

Modulate’s voice skins are powered by machine learning algorithms that adjust the audio patterns of a person’s voice to make them sound like someone else. To teach its technology to voice many different tones and timbres, the company collected and analyzed audio from hundreds of actors reading scripts crafted to provide a wide range of intonation and emotion. Individual voice skins are created by tuning algorithms to replicate the sound of a specific voice actor.
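Modulate hasn’t published the details of its models, but the basic idea of remapping one voice’s audio patterns can be illustrated with a far simpler operation: shifting pitch by resampling. The sketch below is deliberately naive—real voice skins use learned models that preserve timing and timbre, and the 220 Hz sine wave standing in for a voice, along with the `shift_pitch` helper, are illustrative assumptions, not Modulate’s method:

```python
import numpy as np

def shift_pitch(signal, semitones):
    """Naive pitch shift by resampling: lowers (or raises) pitch but
    also stretches duration, which learned voice models avoid."""
    factor = 2 ** (semitones / 12)            # frequency ratio
    old_idx = np.arange(len(signal))
    new_idx = np.arange(0, len(signal), factor)
    return np.interp(new_idx, old_idx, signal)

def dominant_freq(signal, sr):
    """Estimate the strongest frequency via the FFT peak."""
    spectrum = np.abs(np.fft.rfft(signal))
    return np.fft.rfftfreq(len(signal), 1 / sr)[np.argmax(spectrum)]

sr = 16000
t = np.arange(sr) / sr                        # one second of audio
voice = np.sin(2 * np.pi * 220 * t)           # 220 Hz "voice"
deeper = shift_pitch(voice, -12)              # one octave down
print(dominant_freq(voice, sr), dominant_freq(deeper, sr))
```

Resampling like this drops the dominant frequency from 220 Hz to 110 Hz, a crude "deeper voice"—but it also doubles the clip’s length, which is why production systems rely on trained models rather than simple signal tricks.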

Pappas says the process adds a delay of only about 15 milliseconds, making it essentially unnoticeable. As a safeguard against fraud, the company also adds digital watermarks to its voices, designed to be inaudible to the human ear but obvious to audio software.
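Modulate hasn’t said how its watermark works, but a textbook way to hide a tag that people can’t hear yet software can find is spread-spectrum embedding: mix a faint, keyed pseudorandom sequence into the audio and later detect it by correlation. This is a minimal sketch under that assumption—every name and parameter here is illustrative, and a real system would psychoacoustically shape the mark and harden it against compression:

```python
import numpy as np

def embed_watermark(audio, key, strength=0.02):
    """Mix in a keyed pseudorandom sequence (spread-spectrum style).
    Real systems shape the mark to stay below hearing thresholds."""
    rng = np.random.default_rng(key)
    mark = rng.standard_normal(len(audio))
    return audio + strength * mark

def detect_watermark(audio, key, strength=0.02):
    """Correlate against the same keyed sequence. The normalized
    score estimates the embedded strength: near `strength` when the
    mark is present, near zero otherwise."""
    rng = np.random.default_rng(key)
    mark = rng.standard_normal(len(audio))
    score = float(np.dot(audio, mark) / np.dot(mark, mark))
    return score > strength / 2

sr = 16000
t = np.arange(sr) / sr
speech = 0.3 * np.sin(2 * np.pi * 180 * t)   # stand-in for a voice signal
marked = embed_watermark(speech, key=42)
print(detect_watermark(marked, key=42))      # mark present
print(detect_watermark(speech, key=42))      # no mark
```

Because detection requires the secret key, anyone with the key can check whether a clip carries the mark, while ordinary listeners—and anyone without the key—hear nothing unusual.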

Software that can change a person’s voice is not a new concept, but existing technology is often obtrusive or grating, masking a person’s voice rather than replacing it. In voice chats, that can attract unwanted attention. Last month, superstar Twitch streamer Pokimane revealed she had tried voice-changing technology to avoid harassment often aimed at female gamers. It did not go well. “I sounded like a robot LOL,” she tweeted.

Modulate’s voice skins sounded strikingly real in demos from company staff and early testers. Most glitches resembled the distortions common on conventional phone calls, such as an occasional robotic note or flattened tone, though it seemed likely a listener could, with practice, learn to detect the technology.

Modulate declined to provide a version of the software to test; Animaze, the preview augmented reality app, was glitchy with the hardware WIRED had available. Pappas claims to have charmed investors by joining Zoom calls with a voice skin, and only later revealing his natural voice. He says staff tests in public voice chats have gone similarly undetected.

Fred and Jane signed up to join Modulate’s early testers because they saw voice-altering algorithms as a way to gain new control of how their gender is perceived online. The pair first became friends via Discord and fell into the habit of voice chatting daily, playing games and sharing their experiences with gender transition.

Both had met with only partial success using vocal techniques to shift their sounds closer to their gender identities, Fred towards the masculine and Jane the feminine. “My voice has changed, but not as much as maybe I wanted it to,” says Fred, who underwent hormone treatment. “I tended to avoid voice chats.” Jane spent hundreds of dollars on voice-altering audio hardware with disappointing results. “It added a chipmunky quality that is a little unpleasant to listen to,” she says.

The friends found Modulate’s technology more pleasing and convincing. “The quality was really impressive compared with other things I had tried,” Fred says. It also helps to take time figuring out which voice skins work well with your own voice, and how to project your intonations through the algorithmic overlay. Nonverbal sounds such as coughs or laughs can trigger giveaway glitches in the AI voice.

Modulate asks testers not to use the technology in the wilds of the public internet, but Fred and Jane say trans friends have been impressed. They now generally don their voice skins whenever they talk together. “It’s just nice to sound like ourselves together,” Jane says.

Trans people have often pioneered new uses of technology that can tune or obscure identity, says Tee Chuanromanee, who researches human-computer interaction at the University of Notre Dame. Virtual world Second Life provides one example.

“Technology has opened up a lot of avenues for exploring new aspects of yourself and connecting with others,” Chuanromanee says, and voice skins sound promising. At the same time, any new technical adaptation won’t change the underlying reason many trans people are wary of public spaces, digital or otherwise. “Concerns about safety and privacy will always be in the background—am I going to get outed?” Chuanromanee says.

Fred and Jane say they hope Modulate’s technology will eventually help them to venture into public online spaces, such as streaming, more comfortably. They won’t abandon more conventional voice techniques, though. “It’s important to be able to work on this because I’m not going to be able to bring this to the supermarket,” Jane says. “Much as I would like to.”


Tom Simonite