
Synthesizing natural voice using an open-source TensorFlow implementation of Google's Tacotron



When it comes to AI technologies, Google is top of the line.
In 2017, Google published its paper “Tacotron: Towards End-to-End Speech Synthesis”,
which simplifies the process of teaching an AI to speak.
You can read the paper here:

In this video, I’m using an unofficial open-source TensorFlow implementation of the Tacotron system to synthesize natural voice.
The pre-trained model available on GitHub was trained for roughly 400k steps.
Training your network further, even up to 800k steps, makes little difference until someone improves the network itself.

Tacotron on GitHub: https://github.com/keithito/tacotron/

The current audio samples aren’t as natural as Google’s Tacotron samples, but we’ll get there soon as long as there are intelligent people (like keithito) out there coding day and night.

Thanks for Watching!
