I wanted to see how closely I could resemble HAL-9000 in my own ragged rendition of the song Daisy Bell. It turns out I actually sound better as a vague approximation of the demented artificial intelligence from 2001: A Space Odyssey than either HAL’s actor, Douglas Rain, or my normal, human voice does.
The Spain-based company Voicemod revealed its new beta AI Voices product Wednesday, which the company says uses AI to change a speaker’s voice while maintaining inflection, emotion, and pacing. The app lets users recreate a March of the Penguins-esque narration of their coworkers waddling to the bathroom in the voice of God himself (AKA Morgan Freeman, the only fictional president who would get my vote). The app also has generic male and female voices, alongside more outlandish ones that make you sound like HAL-9000 or like your speech is crackling in over an astronaut’s or airplane pilot’s radio.
Your mileage may vary for speech-to-speech sound transformation. Based on our tests, the better the microphone you’re using, the more accurate the transformed voice will be. That’s common enough for these kinds of apps. Some of the voices seemed more natural than others. “Morgan” sounded slightly off, not quite hitting the inflections of the famed narrator and actor. While they’re both fun, there’s nothing among the astronaut or pilot voices that would really convince either the U.S. Air Force or NASA that you’re coming in for a sudden, unexpected landing. Still, the tech could be good for trolling your friends in voice chat or for running a more effective tabletop roleplaying session.
In a release announcing the new feature, Voicemod said the technology is especially useful for streamers, content creators, roleplayers, or even for people in the trans community trying to express themselves online. The company quoted streamer and esports coach/organizer Kairi Caitlyn, who said further development of AI voices could be a benefit to “people like me in the trans community.”
That means the masculine “Bob” and feminine “Alice” voices are perhaps the most interesting, and potentially the most consequential, for people trying to express their identities online. Other startups are also working on AI technology aimed at serving the trans community online. Many of these companies, including Voicemod, refer to the nebulous “metaverse” when describing how this technology could be used to create personalized avatars. As much as companies like Meta want to let you recreate yourself in a virtual world, some users would prefer to express their inner selves when creating online personas.
CEO and co-founder Jaime Bosch wrote that the tech “enables a previously impossible level of customization in audio expression online and in the metaverse.” In a statement to Gizmodo, Bosch said they want to support the full slate of trans, nonbinary, gender-fluid, and gender-queer users to “better empower them to build their own unique sonic identities.”
The flip side of this technology is the potential for abuse. The rise of deepfake technology has some in the political and tech worlds concerned, though deepfakes still have a long way to go if they want to be more effective than good old-fashioned online disinformation. These most recent speech-to-speech systems are not likely to fool most people, but even Voicemod recognized that security is a concern. Bosch said they are working on systems to identify whether somebody is using a synthetic voice.