AI News: Google's newest voice AI is indistinguishable from real human speech
This Startup's Artificial Voice Sounds Almost Indistinguishable From A Human's
An Irish startup has claimed a breakthrough in text-to-speech synthesis that improves on public demonstrations by Google’s DeepMind and Facebook.
It sounds eerily human, and shows that you no longer need a multi-billion dollar R&D budget or hundreds of engineers to produce an artificial voice that’s as good as Google’s.
Voysis founder Peter Cahill insists that the sample above has not been pre-recorded by a human, but produced by an algorithm that was trained on a popular dataset for building text-to-speech software.
The method uses a particular type of neural-network architecture to create sound, and is said to represent a significant leap forward in artificial-voice technology.
The development comes at a time when digital assistants are becoming more popular because they exist not only on smartphones but also on smart speakers in the home, an environment where consumers feel more comfortable speaking out loud to devices.
The method represents a 50% improvement in artificial voice generation, according to Google’s own research, after years of tiny, incremental improvements to popular use-cases like Siri and Alexa.
Natural speech typically gets a score of 4.5 (factoring in the suspicion listeners have when asked to judge the naturalness of speech). Traditional artificial voices, like the ones you hear in train and bus announcements or from digital assistants, score at around 3.8.
The most popular method for creating an artificial voice till now has been to use the so-called concatenative method, which involves recording a huge amount of voice data and slicing it up into small units.
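To make the concatenative idea concrete, here is a minimal Python sketch, not any vendor's actual system. It assumes a hypothetical inventory that maps unit names to short waveform snippets (plain lists of samples) and joins them with a simple linear crossfade at each seam. The names `crossfade_concat`, `synthesize`, `inventory` and `overlap` are illustrative assumptions.

```python
from typing import Dict, List


def crossfade_concat(units: List[List[float]], overlap: int) -> List[float]:
    """Join waveform snippets, linearly crossfading up to `overlap` samples at each seam."""
    out: List[float] = []
    for unit in units:
        # Never overlap more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming unit rises across the seam
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        out.extend(unit[n:])
    return out


def synthesize(unit_names: List[str], inventory: Dict[str, List[float]],
               overlap: int = 2) -> List[float]:
    """Look up each pre-recorded unit (hypothetical inventory) and stitch them together."""
    return crossfade_concat([inventory[name] for name in unit_names], overlap)


# Toy inventory: two made-up "units" of four samples each.
inventory = {
    "h-e": [0.0, 0.5, 1.0, 1.0],
    "e-l": [1.0, 1.0, 0.5, 0.0],
}
waveform = synthesize(["h-e", "e-l"], inventory, overlap=2)
```

Real systems choose among many recorded candidates per unit and smooth the joins far more carefully. The sketch only shows why the approach depends on a large library of recorded speech.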
Not only that, but Google's DeepMind has also given away the blueprints for WaveNet to anyone who wants them, in the form of a research paper released last year, meaning everyone in the industry can start on the same page.
Cahill says he’s carefully picked the right talent and built up a network of contacts in voice technology since 2002, chairing a speech-synthesis special interest group and helping to organize conferences.
Another speech scientist, who worked on WaveNet at DeepMind and left earlier this year, says the latest demo from Voysis is “essentially glitch-free.” Anthony Tomlinson doesn’t work with Voysis but verified its improvements to Forbes.
Over time, WaveNet can also make it possible for software to manipulate existing voices into saying things that are close to natural, without having to spend hours in a booth recording thousands of units of speech.
In November 2016 Adobe teased a new application called Project Voco which it called “Photoshop for speech.” In a demo, the company edited a recording of a man saying “I kissed my dogs and my wife” and manipulated it to say “I kissed Jordan three times.” Adobe has yet to release Voco to consumers.
Cahill recalls one of the authors of DeepMind’s landmark WaveNet research paper, Heiga Zen, saying at a Speech Synthesis workshop one year ago that within three years, people wouldn’t be able to tell whether they were listening to a machine or a human.
Video games could include an infinite array of natural-sounding dialogue, and advertisers wouldn’t have to keep bringing voice actors - or celebrities - into the studio.
Lyrebird claims it can recreate any voice using just one minute of sample audio
Using artificial intelligence, companies like Google have been able to create incredibly life-like synthesized voices, while Adobe has unveiled its own prototype software called Project VoCo that can edit human speech like Photoshop tweaks digital images.
The resulting speech can be put to a wide range of uses, says Lyrebird, including “reading of audio books with famous voices, for connected devices of any kind, for speech synthesis for people with disabilities, for animation movies or for video game studios.” It takes quite a bit of computing power to generate a voice-print, but once done, the speech is easy to make — Lyrebird can create one thousand sentences in less than half a second.
In an “Ethics” section on the company’s website, Lyrebird’s founders (three university students from the University of Montréal) acknowledge that their technology “raises important societal issues,” including bringing into question the veracity of audio recordings used in court.
Their solution is to release the technology publicly and make it “available to anyone.” That way, they say, the damage will be lessened because “everyone will soon be aware that such technology exists.” Speaking to The Verge, Alexandre de Brébisson of Lyrebird adds: “The situation is comparable to Photoshop.
- On 6 May 2021
Google’s newest voice AI is indistinguishable from real human speech
Google's demonstration of the latest artificial intelligence studies applied to text-to-speech ...
Google’s voice generating AI is now indistinguishable from humans
In a recently published research paper by the people at Google, the team introduces details to the impressive speech system called Tacotron 2. In the paper ...
Google's AI assistant apes the human voice
Google's new "Duplex" AI ...
Google's AI-driven TTS engine
Google has continued to improve its text-to-speech engine and the crew compares audio clips to see if they can guess which clip is artificial and which is an ...
Google's New Voice (AI) | Shelly Palmer on CNN
Shelly Palmer speaks with Paula Newton about Google Duplex, Google AI and the future of Automatic Speech Recognition and Natural Language ...
Breaking News - Baidu's creepy new AI can accurately mimic your voice
The Chinese answer to Google can now clone your voice using AI after hearing you talk for just one minute. Baidu, which created this creepy technology, says it ...
Synthesizing natural voice using Google's Tacotron-2 open sourced tensorflow implementation
When it comes to AI technologies, Google is top of the line. In 2017, Google published its paper "Tacotron: Towards End-to-End Speech Synthesis". That ...
Tacotron2 Google’s Newest Text To Speech AI Talks Just Like Us!
Google's newest text-to-speech AI talks naturally, just like us! Google published a research paper this month elaborating on its new text-to-speech AI, Tacotron 2, that ...
Tacotron 2 - THE BEST TEXT TO SPEECH AI YET!
In this video, I am going to talk about the new Tacotron 2, Google's text-to-speech system that is the closest to human speech to date. If you like the video, ...
Prophecy Alert: "Google's Tacotron 2 "AI Speaks" Artificial Intelligence 666
"Prophecy Alert as Google Tacotron 2 software will cause "AI to Speak" 666 the Beast Speaks" Spread the Word ..