
How does text to speech technology work?

Text to speech, or TTS for short, is a game-changing technology used by many businesses and individuals around the world. Speaking of individuals, it is worth mentioning that without this technology, one of the most prominent scientists of our time, Stephen Hawking, could not have communicated with us the way he did, and sharing his scientific work would have been far harder.


But what is TTS, and how does it work?

Generally, text to speech conversion, also called speech synthesis, includes text analysis, linguistic analysis and waveform generation stages. We can also think of a text to speech engine as having two major parts: a front-end and a back-end. The front-end first pre-processes the input, converting numbers, abbreviations and symbols into written-out words. It then assigns a phonetic transcription to each word, a process called text-to-phoneme conversion. The back-end, often called the synthesizer, turns that phonetic representation into an audible waveform.
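To make the pre-processing and text-to-phoneme steps more concrete, here is a minimal Python sketch. It is only an illustration, not how any particular engine works: the abbreviation table, the digit list and the tiny pronunciation dictionary are all made up for this example, and a real engine relies on much larger dictionaries plus rules or trained models for words it has never seen.

```python
import re

# Hypothetical lookup tables, for illustration only.
ABBREVIATIONS = {"dr.": "doctor", "st.": "street", "&": "and"}
DIGITS = ["zero", "one", "two", "three", "four",
          "five", "six", "seven", "eight", "nine"]

# Toy pronunciation dictionary using ARPAbet-style symbols (assumed entries).
PHONEMES = {
    "doctor": ["D", "AA", "K", "T", "ER"],
    "and": ["AE", "N", "D"],
    "two": ["T", "UW"],
    "cats": ["K", "AE", "T", "S"],
}


def normalize(text):
    """Pre-processing: turn abbreviations, symbols and digits into written-out words."""
    words = []
    for token in text.lower().split():
        if token in ABBREVIATIONS:
            words.append(ABBREVIATIONS[token])
        elif token.isdigit():
            # Read numbers digit by digit; real engines handle full numbers.
            words.extend(DIGITS[int(d)] for d in token)
        else:
            # Drop punctuation and keep only letters and apostrophes.
            cleaned = re.sub(r"[^a-z']", "", token)
            if cleaned:
                words.append(cleaned)
    return words


def to_phonemes(words):
    """Text-to-phoneme: look each word up; unknown words are marked, not guessed."""
    return [PHONEMES.get(word, ["<unk>"]) for word in words]


words = normalize("Dr. Smith & 2 cats")
print(words)
print(to_phonemes(words))
```

Here "Dr." becomes "doctor", "&" becomes "and" and "2" becomes "two" before any phonemes are assigned, while "smith" is flagged as unknown because it is missing from the toy dictionary.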

A text-to-speech engine's performance is measured by two main qualities: naturalness and intelligibility. In other words, a high-performing TTS engine produces speech that is both easy to understand and human-like. Engine developers keep working to maximize these two qualities, and as a result the newest engines sound far more natural than the machine-like systems most of us have used in the past.

The technology that Artivle uses is the WaveNet generative model, which is reported to sound more natural than the best previous text-to-speech systems, reducing the gap with human performance by over 50%. WaveNet is a deep generative model of raw audio waveforms. It is trained on recordings of real speech and can then generate new speech-like waveforms, building the signal up over time. The generated waveforms can even include realistic breaths and lip smacking.
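The idea of generating a waveform step by step, with each new audio sample depending on the ones before it, can be sketched in a few lines of Python. This is a toy illustration only and not WaveNet itself: the function toy_next_sample is a made-up stand-in for the trained neural network, and the 16 kHz sampling rate is an assumption.

```python
import random


def toy_next_sample(history, rng):
    """Stand-in for a trained network: predict the next sample from the last two.

    A real model learns this mapping from recorded speech. Here a fixed
    damped-oscillator rule plus a little noise is used purely for illustration.
    """
    prev, prev2 = history[-1], history[-2]
    predicted = 1.8 * prev - 0.9 * prev2
    return predicted + rng.gauss(0.0, 0.01)


def generate(num_samples, seed=0):
    """Autoregressive loop: every new sample is conditioned on earlier output."""
    rng = random.Random(seed)
    waveform = [0.0, 0.1]  # two starting samples so the rule has a history
    while len(waveform) < num_samples:
        waveform.append(toy_next_sample(waveform, rng))
    return waveform


audio = generate(16000)  # one second of audio at an assumed 16 kHz rate
print(len(audio), audio[:5])
```

The point of the sketch is the loop: the output is fed back in as input, which is why a model of this kind can produce long, continuous audio rather than stitching together pre-recorded pieces.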

With all this said, there is still room for improvement in making text to speech conversion a more human-like experience. Still, it is not hard to predict that the technology will evolve rapidly and make our lives easier.

What do you think of text to speech technology? Can it replace human reading and voice-over in the future? Please let us know in the comments section.
