English Wikipedia: ElevenLabs

ElevenLabs is a software company that specializes in developing natural-sounding speech synthesis and text-to-speech software, using artificial intelligence and deep learning.

It has been recognized as one of the major companies behind the ongoing AI Spring.[1]

History

ElevenLabs was co-founded in 2022 by Piotr Dąbkowski, an ex-Google machine learning engineer, and Mateusz Staniszewski, an ex-Palantir deployment strategist.[2] Both were raised in Poland, and their inspiration for founding ElevenLabs reportedly came from watching inadequately dubbed American films.[3][4]

Dąbkowski and Staniszewski initially considered different funding options, including the possibility of collaborating with a startup accelerator. In January 2023, they announced that they had secured a $2 million pre-seed round. The startup's specialization in AI voice intelligence, a still-emerging field in Europe, played a significant role in attracting investors. The pre-seed round was led by Credo Ventures, with participation from Concept Ventures.[5]

In January 2023, ElevenLabs publicly released its beta platform.[6]

In June 2023, ElevenLabs raised a $19 million Series A funding round at a valuation of about $100 million,[7][8] despite the company having no office and only 15 employees.[4][8] The funding round was co-led by the venture capital firm Andreessen Horowitz, ex-GitHub CEO Nat Friedman, and entrepreneur Daniel Gross. It also saw participation from the investment firm SV Angel and from prominent individuals such as Mike Krieger (co-founder of Instagram), Brendan Iribe (co-founder of Oculus), Mustafa Suleyman (co-founder of DeepMind), and Tim O'Reilly (founder of O'Reilly Media). It was also announced that Andreessen Horowitz would be joining ElevenLabs' board.[3]

On January 22, 2024, ElevenLabs raised an additional $80 million in Series B funding, raising the company's valuation to $1.1 billion. The funding round was led by Andreessen Horowitz, Friedman, Gross, and Sequoia Capital. Additionally, the company announced a series of new products, such as its Voice Marketplace, AI Dubbing Studio, and mobile app.[9]

Products

ElevenLabs is primarily known for its browser-based, AI-assisted text-to-speech software, Speech Synthesis, which can produce lifelike speech by synthesizing vocal emotion and intonation.[10] The company states that its software is built to adjust the intonation and pacing of delivery based on the context of the language input.[11] It uses algorithms to analyze the contextual aspects of text, aiming to detect emotions such as anger, sadness, happiness, or alarm, which enables the system to infer the user's sentiment[12] and produce a more realistic, human-like inflection. The startup is in the process of patenting this technology.[5] Through its beta site, users can submit text and generate audio files from a selection of default voices. Paying users can upload custom voice samples to create new vocal styles using the company's voice cloning tool.[13]

Voice Library is the company's feature for sharing unique voice profiles created using its Voice Design technology. These pre-designed voice profiles allow users to select a voice that best suits their needs, rather than creating one from scratch.[14] Another tool, VoiceLab, allows users to clone voices from just a few short snippets of audio and to create entirely new synthetic voices.[3]

On June 20, 2023, ElevenLabs released an AI recognition tool called the AI Speech Classifier, which it claims is the first of its kind.[3] The tool is accessible through an API and is designed to determine whether an uploaded audio sample originates from ElevenLabs' proprietary AI technology.[4] The company has expressed its intention to collaborate with other AI developers in creating a universal detection system that could be adopted industry-wide.[15]

In July 2023, ElevenLabs announced "Projects", a tool for creating long-form spoken content such as audiobooks and dialogue segments with contextually aware synthetic or custom voices.[4][16] The tool was released in September. In August, ElevenLabs expanded its voice generation capabilities to 28 languages. Using an in-house AI model, it automatically detects languages such as Korean, Dutch, and Vietnamese, allowing for "emotionally rich" multilingual speech generation. The company also announced that its technology had officially exited its beta phase.[17][18]

In October 2023, ElevenLabs presented "AI Dubbing," a tool that can translate speech into more than 20 languages. The feature is capable of preserving the speaker's original voice, emotions, and intonation by employing proprietary methods to handle tasks such as noise removal, speaker differentiation, transcription, and synchronization of translated speech with the original audio.[19]

Uses

ElevenLabs' use cases span a range of sectors.

Content creators have used ElevenLabs for podcasts, narration, and comedy shows.[20][21][22] In March 2023, comedian Drew Carey used ElevenLabs' voice cloning tool to recreate his voice for an episode of his radio show, Friday Night Freakout.[11] In April 2023, Polish TV and radio presenter Jaroslaw Kuzniar used a synthesized version of his voice to deliver a series of podcasts on the war in Ukraine.[23] Seth Godin has also used ElevenLabs to narrate his AI-focused podcast.[3]

In March 2023, Super-Hi-Fi, a streaming automation service, partnered with ElevenLabs to launch a fully automated radio service called "AI Radio", using ElevenLabs' software to voice its virtual DJ from prompts generated with ChatGPT.[24] ElevenLabs has also been employed for narrating games and voicing game characters in partnerships with Swedish game developer Paradox Interactive and the United Kingdom-based Magicave.[3][25]

Publishers and authors have used ElevenLabs to narrate audiobooks and newsletters.[5][26] On June 13, 2023, Storytel announced an exclusive partnership with the company. Under this collaboration, ElevenLabs will create voices tailored specifically to Storytel's core markets and produce AI-narrated audiobooks. A voice-changing feature called VoiceSwitcher was implemented to let listeners personalize their listening experience.[27][28]

ElevenLabs has been used to generate audio for dubbing videos in different languages, including by content creators.[5][8] The platform is described as able to accurately replicate almost any accent in any language.[29] Fans have used ElevenLabs to create inspirational messages in the voices of their favorite celebrities.[30]

In February 2023, VICE reporter Joseph Cox reported that he had recorded five minutes of himself talking and then used ElevenLabs to create voice deepfakes that defeated a bank's voice-authentication system.[31]

ElevenLabs sets explicit guidelines regarding the use of its technology, forbidding the cloning of voices for abusive purposes such as fraud, discrimination, hate speech, or online abuse, although it does support the use of its platform for "caricature, parody and satire" and "artistic and political speech contributing to public debates." The company asserts its authority to suspend the accounts and content of users found in violation of these guidelines, and it also highlights its commitment to cooperate with authorities and report any illegal activities in accordance with applicable laws.[3] In January 2023, the company admitted that its platform had been used for "voice cloning misuse cases"[32] and toughened its safeguards against vexatious use of its technology.[33]

Reception

Following its launch in January 2023, ElevenLabs gained rapid momentum and was commended for its voice output quality, fast generation times, and a "generous free tier". It has also been praised for its ability to accurately pronounce names with unique or uncommon pronunciations, addressing a common shortcoming in similar tools that often cater primarily to Western names.[34] The company reached over one million registered users between its launch and June 2023.[3][4][35]

Criticism and controversy

ElevenLabs was criticized after users were able to abuse its software to generate controversial statements in the vocal style of celebrities, public officials, and other famous individuals,[36][37][38][39][33] particularly attracting attention after users on 4chan used the tool to share hateful messages.[40][15] The software's ability to closely copy real voices has raised ethical concerns, with critics considering it a form of deepfaking.[41] In response, the company said it would work on mitigating potential abuse through safeguards and identity verification.[6] The company has subsequently limited access to its voice cloning feature to paid subscribers,[42] citing the requirement to provide payment information as a means of improving accountability,[43] and has implemented bans on users who repeatedly violate the terms of service.

In the lead-up to the January 2024 New Hampshire Democratic primary, thousands of residents received AI-generated robocalls, seemingly from Joe Biden, encouraging them not to vote on the day of the primary. The New Hampshire attorney general's office launched an investigation into the incident and linked it to a company based in Texas, with audio experts concluding the call was made using ElevenLabs. In response to the incident, CEO Mati Staniszewski stated that the company was "dedicated to preventing the misuse of audio AI tools" but provided no comment on specific incidents.[44]

Additional concerns have been raised over the ethics of the source of ElevenLabs' training data, with multiple voice actors claiming ElevenLabs used samples of their voices without their consent.[45] ElevenLabs, along with other companies in its category, has thus been seen as a potential challenge to the voice acting sector.[18]

See also

References

External links