A team from Google has published a peer-reviewed scientific paper describing the principles behind its human-like speech synthesis system: https://arxiv.org/abs/1712.05884
The audio examples at https://google.github.io/tacotron/publications/tacotron2/ are interesting: you can hear both human-spoken and system-synthesized speech there. Try to tell them apart. I couldn't.
