In today’s online economy, more streamers have found fame and fortune as celebrities with compelling digital identities that interact daily with their fans. They no longer just play or create video games and viral videos, but also build up online personas that amuse their audiences, disguise their real identities or intimidate gaming opponents. This is in part made possible by voice modification technology – long used to create superheroes’ voices in movies and manga shows – that changes the pitch or tone of a user’s voice, and adds amplification or distortions.
Voicemod is one company that has managed to scale voice modification tech for a bigger market. Founded in Valencia in 2014 by Jaime, Fernando and Juan Bosch, three tech-loving brothers and musicians, Voicemod has leveled up its voice filters by integrating them into popular video games such as GTA V, Minecraft, and Fortnite.
Today, Voicemod is profitable and a top name in immersive audio experiences, voice augmentation and voice modulation. It has more than 2.5m monthly active users across 65 countries, 99% of whom are located outside of the company’s native Spain. Its technology has also been integrated into software for broader mainstream audiences, such as WhatsApp desktop, Zoom and Skype. Business has grown with the pandemic. As of September this year, Voicemod had expanded its team almost fourfold to 127 employees, up from 33 pre-Covid.
Incubated by Demium Startups and backed by investors such as Wayra, the CVC arm of Telefonica, Spain’s national telco, Voicemod last year raised $8m in a funding round led by e-sports and gaming investor BITKRAFT Ventures. The startup is using the money to invest in AI voice and speech conversion technology for more unique and realistic voices and to expand in Asia, the world’s largest gaming market, with 1.48bn gamers.
Riding the Fortnite, Minecraft wave
Voicemod had its beginnings in 2009, when the Bosch brothers created a B2C music app as a side hustle to the studio business that was their main job. As part of this project, they experimented with voice modulation.
“The result of this was what we called the ‘Voicemod Experience’ – a completely new way to experience your own voice – which became the driving force of the app’s evolution,” said Jaime Bosch, Voicemod’s CEO.
“This led us to reshape our vision for the product, into something that could ultimately evolve human connection through the medium of sound. So we brought the experience from mobile to PC, where it was instantly picked up by the exploding gaming and streaming scene – and the rest is, as one says, history.”
Voicemod’s shift to real-time voice modification for gamers and streamers proved to be its ticket to success. This move coincided with the rise of games like Minecraft and Fortnite, which entice players to spend generously on in-game enhancements to avatars’ appearances and abilities. For example, 70% of Fortnite players admit to having spent money on such “extra” features.
Technologically, Voicemod was ahead of the pack. According to Jaime Bosch, at that point “most voice changing technology was asynchronous, so to be able to experience being someone else in a real-time setting was novel.” Using AI, advanced voice recognition technology and digital signal processing, voices were transformed and made unrecognizable. Female voices could be made to sound male, and vice versa, while children could be made to sound like adults.
These capabilities enabled gamers to complete their avatars not only with skins and gadgets, but also with the right voices. In addition, Voicemod’s tools promoted online safety by masking the voices of women and children, shielding them from online bullies and keeping real-life identities private.
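To give a rough sense of what changing a voice’s pitch involves at the signal level, here is a minimal Python sketch. It is not Voicemod’s method, which the company has not detailed here; it assumes the crudest possible approach, resampling the audio by linear interpolation, which raises or lowers pitch but also shortens or lengthens the clip. Real-time voice changers generally rely on more sophisticated techniques that preserve duration. The function names `sine_wave`, `shift_pitch` and `estimate_frequency` are illustrative only.

```python
import math


def sine_wave(freq_hz, sample_rate, seconds):
    """Generate a pure test tone as a list of floats in [-1, 1]."""
    n = int(sample_rate * seconds)
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


def shift_pitch(samples, factor):
    """Naive pitch shift by resampling with linear interpolation.

    factor > 1 raises the pitch and shortens the clip;
    factor < 1 lowers the pitch and lengthens it.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    last = len(samples) - 1
    out = []
    for i in range(int(len(samples) / factor)):
        pos = i * factor                  # fractional read position in the input
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out


def estimate_frequency(samples, sample_rate):
    """Rough frequency estimate from the number of zero crossings."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return crossings * sample_rate / (2 * len(samples))


# Usage: a 200 Hz tone shifted up by a factor of 1.5.
tone = sine_wave(200, 8000, 1.0)
higher = shift_pitch(tone, 1.5)
```

In this sketch, the shifted tone plays back at roughly 1.5 times the original frequency, at the cost of being a third shorter.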
Subsequently, the company introduced context-related effects for more immersive gaming. These allowed gamers to play an echo when entering a tunnel in GTA V, for instance, make themselves sound short of breath when launching off in a parachute in Fortnite, or transform a single voice into a thousand when sending virtual armies into battle.
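The tunnel example can be illustrated with a small Python sketch. It is an assumption-laden toy, not Voicemod’s implementation: the echo is a single delayed, quieter copy of the signal mixed back in, and the `context` string and the function names `add_echo` and `voice_for_context` are hypothetical stand-ins for whatever signal a game would actually send.

```python
def add_echo(samples, sample_rate, delay_seconds=0.25, decay=0.5):
    """Mix one delayed, attenuated copy of the signal back into itself."""
    delay = int(sample_rate * delay_seconds)
    out = list(samples) + [0.0] * delay   # leave room for the echo tail
    for i, sample in enumerate(samples):
        out[i + delay] += sample * decay
    return out


def voice_for_context(samples, sample_rate, context):
    """Apply an effect only when the (hypothetical) game context calls for it."""
    if context == "tunnel":
        return add_echo(samples, sample_rate)
    return list(samples)


# Usage: a single click, heard inside and outside a tunnel.
click = [1.0, 0.0, 0.0, 0.0]
in_tunnel = voice_for_context(click, sample_rate=8, context="tunnel")
on_street = voice_for_context(click, sample_rate=8, context="street")
```

Keeping the effect separate from the context check means new situations could be mapped to other effects without touching the audio code.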
Voicemod also found success beyond gaming. “The defining moment for us … was the realization that people were using our technology to not just have fun, but to shape their entire way of expressing themselves online. This is when we realized that we were building something that wasn’t just about entertainment, but possibly the next step in the future of social audio experiences,” Jaime Bosch said.
Besides voice-changing software, Voicemod has created a sound library, a custom voice-creation tool and a software development kit for developers. The company’s soundboard tool offers “audio emojis,” which can be used in digital chats just like stickers, voice notes or GIFs.
Content creators are a major market segment. In gaming, users are constantly looking for new tactics and tools to impress other gamers and increase their influence in the game. Voicemod similarly seeks to give streamers and content creators tools to monetize their broadcasts. Ibai Llanos, the streamer and esports commentator who is a celebrity at age 26, is one such Voicemod user.
Voicemod Bits, the company’s free plug-in extension for livestreaming platform Twitch, helps streamers increase engagement and monetization by allowing viewers to change a streamer’s voice as they tune in to watch a livestream. Streamers can choose which voice filters to offer by tapping into Voicemod’s library of voice effects. They can also set how much viewers need to pay to activate them and how long the voice filters remain available.
Voicemod has a team of sound engineers focused on creating new voices and meeting specific user demands. Using machine learning, the company has analyzed abstract hidden structures within speech, such as phonology, content, identity, intention and mood, in order to design tools giving users unprecedented control over their perceived voice identities. It also watches emerging trends to come up with new products. For instance, it was inspired by virtual concerts held on Fortnite to create a feature that lets users sing in the same voice as their favorite artistes.
Moving forward, the company plans to use neural networks to create completely new voices from scratch, rather than altering a user’s original voice by changing pitch or adding reverberation and echoes. When ready, this groundbreaking technology will allow even greater personalization. Voicemod also hopes to make its solutions more accessible and draw more developers to use its tools.
“Building real-time voice changing technology and developing a system of fully customizable sonic expressions is a lot of work. Our team has taken that step out of the equation by designing an entire kit that can easily be integrated by developers anywhere,” Jaime Bosch said.