Polish startup wows with ‘deep fake’ voice cloning tool which can make Leonardo DiCaprio sound like Kim Kardashian
A Polish startup is wowing the tech world with a ‘deep fake’ voice cloning tool which can imitate ‘the whole gamut of human emotions’ using any voice.
The brainchild of Piotr Dąbkowski, a former Google machine learning engineer, and Mati Staniszewski, a former Palantir deployment strategist, ElevenLabs builds speech synthesis and voice cloning tools that are capable of replicating a human voice and any accent by “relying on high compression and context understanding to render human speech ultra-realistically.”
The founders hope that their mimicking tools will take over cinema dubbing and audiobooks, and in so doing turn their startup into a billion-dollar company. Last month, Czech venture capital firm Credo announced that it will lead a $2 million pre-seed round for the company.
The startup came to prominence in September last year when it posted a short video on YouTube showing Leonardo DiCaprio speaking from the stage at the UN Climate Summit.
After the first four seconds, he starts speaking in the voices of famous people such as Joe Rogan, Steve Jobs, Robert Downey Jr., Bill Gates, and Kim Kardashian, imitating each one's speech pattern, tone of voice, and emotions perfectly.
The technology is not without controversy, though.
The high quality of the cloned voices and the apparent ease with which users created them have made many wary of the potential threat of deepfake audio clips.
Recently, internet trolls on the anonymous imageboard website 4chan used ElevenLabs to make deepfake voices of Emma Watson, Joe Rogan, and others saying racist, transphobic, and violent things.
In one upload, a computer-generated voice that sounds like Emma Watson reads some text from Mein Kampf.
In another, a voice strikingly similar to that of Ben Shapiro attacks Alexandria Ocasio-Cortez on racial grounds.
The company is now exploring more safeguards around its technology. These include manually verifying every voice cloning request, or requiring payment information or "complete ID identification" before a cloning process can start.
Meanwhile, the startup's short-term goal is to have its services function across all languages.
In the future, the company says it wants all audio to be handled not by actors or voiceovers, but by intelligent bots.
It also wants to develop speech synthesis tools that will convert speech into any language instantly.