Speaking issues with the Coqui TTS Tacotron2-DDC model
General
Most models are trained on sentences that end with a period, exclamation mark, or question mark. Always end the input with one of these to keep the model from synthesizing strange output.
- TTS model: tts_models/en/ljspeech/tacotron2-DDC
- Vocoder model: vocoder_models/en/ljspeech/hifigan_v2
Input string formatting
Phrases ending in "ah"
An "ah" sound at the end of the input generally produces strange results. Even short names can produce a 12-second clip.
Examples:
- Nelson Mandela
- pergola
Mitigation
If the phrase is at the end of the input, adding terminal punctuation makes it synthesize correctly:
Example: "Nelson Mandela" > "Nelson Mandela."
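The mitigation above can be applied automatically before text is passed to the synthesizer. The following is a minimal sketch, not part of Coqui TTS: the helper name ensure_terminal_punctuation and the choice of a period as the default are assumptions made for illustration.

```python
# Hypothetical pre-processing helper (not a Coqui TTS API).
# Assumption: a period is an acceptable default sentence ending.
TERMINAL_PUNCTUATION = (".", "!", "?")


def ensure_terminal_punctuation(text: str, default: str = ".") -> str:
    """Return text with trailing whitespace removed and a sentence ending added if missing."""
    stripped = text.rstrip()
    if not stripped:
        # Nothing to synthesize; leave empty input empty.
        return stripped
    if stripped.endswith(TERMINAL_PUNCTUATION):
        return stripped
    return stripped + default


print(ensure_terminal_punctuation("Nelson Mandela"))  # Nelson Mandela.
print(ensure_terminal_punctuation("pergola"))         # pergola.
```

This only checks the very last character, so input that ends in a closing quote or bracket would still get a period appended after it.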
Acronyms
To have an acronym spoken as individual letters, format it as:
"A. B. C. news"
Not:
"ABC news" or "A.B.C. news"
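Rewriting acronyms by hand is error-prone, so a small pre-processing step may help. The sketch below is an assumption-laden illustration, not a Coqui TTS feature: the function name spell_out_acronyms is hypothetical, and it treats any standalone run of two or more capital letters (with or without dots) as an acronym to be spelled out.

```python
import re

# Assumption: a standalone run of 2+ capital letters, optionally dotted
# ("ABC" or "A.B.C."), is an acronym that should be read letter by letter.
# The lookahead skips mixed-case words such as "USAir".
_ACRONYM = re.compile(r"\b(?:[A-Z]\.?){2,}(?![A-Za-z])")


def spell_out_acronyms(text: str) -> str:
    """Rewrite acronyms into the spaced, dotted form, e.g. "ABC" becomes "A. B. C."."""

    def _expand(match: re.Match) -> str:
        letters = [ch for ch in match.group(0) if ch.isalpha()]
        return " ".join(f"{letter}." for letter in letters)

    return _ACRONYM.sub(_expand, text)


print(spell_out_acronyms("ABC news"))     # A. B. C. news
print(spell_out_acronyms("A.B.C. news"))  # A. B. C. news
```

Acronyms that are normally pronounced as words (for example "NASA") would also be spelled out by this rule, so an exception list may be needed in practice.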
Mispronounced Words
- video