The potential of "voice clone" technology that makes your voice multilingual

In 2020, the voice cloning field saw a notable wave of service launches.

For example, Ukraine-based "Respeecher" offers voice conversion technology for the entertainment industry. With Respeecher, recorded audio can be converted into the voice of a person whose voice the AI has learned in advance. The company raised $1.5 million in March.

Respeecher uses speech-to-speech technology rather than text-to-speech. Instead of having an AI read out text, it changes the audio into the target person's voice while preserving the original speaker's intonation and delivery. Its current customers include a Hollywood production company. Once a voice actor's audio data has been learned by the AI, narration work can proceed at low cost, with no need to hire an expensive voice actor.

Because Respeecher works within a single language, it can handle Japanese-to-Japanese conversion (the same is true for other languages). The AI takes 3-4 weeks to learn a voice, and pricing that starts at 1 million yen per voice conversion may be a bottleneck, but as the promotional video shows, the quality is considerable. Prices will inevitably come down from the corporate level to the consumer level, so we look forward to future technical progress.

In July, Descript, an audio editing tool for podcasts whose investors include Andreessen Horowitz, launched its Overdub feature. It is a synthesis service that has an AI voice read text aloud. While working in Descript's editing screen, users can generate natural-sounding audio for the phrases they need. Unlike the AI text-to-speech offered by Google and Amazon, its selling point is high-accuracy reading.

However, the challenge for both companies is that they cannot get past "Across Language", meaning the wall between languages. It is still not possible to have Japanese content read aloud as an English speaker. Accents differ, so even when an AI does the reading, the result sounds unnatural. "Resemble.AI" is trying to cross this wall.

In October, Resemble.AI announced "Localize", a speech AI service for local languages, taking a step toward making one's own voice multilingual (however, the training audio must come from a native English speaker). It can convert English audio into French, German, Dutch, Italian, Spanish, and Chinese. Japanese and Korean are to be launched soon.

If this goes smoothly, the day when overseas content arrives already localized into Japanese is near. Distributing podcast content to audio platforms around the world may become the default. I believe the market shift driven by deepfake technology will start with audio and expand to video, significantly changing how podcast and YouTube content spreads. Platforms will also change how they operate, and localization strategies for each language should change as well. The focus now is on whether media companies can make the most of these technical breakthroughs.
