Text cleaning
Text cleaning is the process of normalizing a text corpus before it is recorded, by removing or replacing anything that has no unambiguous spoken form. The examples below show typical text that needs cleaning. When using a TTS model, the text must also be cleaned before it is passed to the synthesizer, unless text cleaning is already part of the TTS process itself.
Numbers
Numbers should be replaced with their written-out form.
You have 3 timers set. ==> You have three timers set.
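A minimal sketch of how this replacement could be automated, assuming English text and standalone integers from 0 to 999. The function names `number_to_words` and `spell_out_numbers` are hypothetical, chosen for illustration only. Larger numbers, decimals, and digits attached to letters (such as "5kg") are deliberately left untouched.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999 (assumed limit of this sketch)."""
    if not 0 <= n <= 999:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def spell_out_numbers(text: str) -> str:
    """Replace standalone runs of one to three digits with their written form."""
    return re.sub(r"\b\d{1,3}\b",
                  lambda m: number_to_words(int(m.group())),
                  text)


print(spell_out_numbers("You have 3 timers set."))
```

The word boundaries in the pattern keep it from touching four-digit years or unit expressions, which need their own rules.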
Time and date
Today is Monday, November 3rd. ==> Today is Monday, November third.
It is 2021. ==> It is twenty twenty-one.
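The date rules can be sketched in the same spirit. The example below assumes English text, treats every four-digit number from 1000 to 2099 as a year (which will not always be correct), and reads years as two pairs of digits. All function names are hypothetical.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
ORDINAL_ONES = ["", "first", "second", "third", "fourth", "fifth", "sixth",
                "seventh", "eighth", "ninth", "tenth", "eleventh", "twelfth",
                "thirteenth", "fourteenth", "fifteenth", "sixteenth",
                "seventeenth", "eighteenth", "nineteenth"]


def two_digits(n: int) -> str:
    """Spell out an integer in the range 0..99."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] + ("-" + ONES[ones] if ones else "")


def day_ordinal(n: int) -> str:
    """Spell out a day of the month (1..31) as an ordinal."""
    if n < 20:
        return ORDINAL_ONES[n]
    tens, ones = divmod(n, 10)
    if ones == 0:
        return {2: "twentieth", 3: "thirtieth"}[tens]
    return TENS[tens] + "-" + ORDINAL_ONES[ones]


def year_to_words(year: int) -> str:
    """Read a year in 1000..2099 the way it is commonly spoken."""
    if year % 1000 < 10:                      # 2000..2009, 1000..1009
        thousands, rest = divmod(year, 1000)
        return ONES[thousands] + " thousand" + (" " + ONES[rest] if rest else "")
    high, low = divmod(year, 100)
    if low == 0:                              # e.g. 1900
        return two_digits(high) + " hundred"
    if low < 10:                              # e.g. 1905
        return two_digits(high) + " oh " + ONES[low]
    return two_digits(high) + " " + two_digits(low)


def clean_dates(text: str) -> str:
    """Spell out ordinal days (3rd, 21st) and four-digit years."""
    text = re.sub(r"\b([1-9]|[12]\d|3[01])(?:st|nd|rd|th)\b",
                  lambda m: day_ordinal(int(m.group(1))), text)
    return re.sub(r"\b(?:1\d{3}|20\d{2})\b",
                  lambda m: year_to_words(int(m.group())), text)


print(clean_dates("Today is Monday, November 3rd."))
print(clean_dates("It is 2021."))
```

Whether a four-digit number is a year or a quantity usually depends on context, so a real corpus may need a manual review pass after a rule like this.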
Abbreviations
Let's go to Dr. John Doe. ==> Let's go to doctor John Doe.
Weight is 5kg. ==> Weight is five kilograms.
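Abbreviations and units are usually handled with lookup tables. The sketch below is one possible approach, not a complete solution: the tables `ABBREVIATIONS` and `UNITS` and both function names are hypothetical, and their entries are only a small illustrative sample. The digit itself is left in place, since spelling out numbers is covered under Numbers above.

```python
import re

# Illustrative sample tables; a real corpus needs a much longer list.
ABBREVIATIONS = {
    "Dr.": "doctor",
    "Mr.": "mister",
    "Mrs.": "missus",
    "Prof.": "professor",
}
UNITS = {
    "kg": "kilogram",
    "km": "kilometer",
    "cm": "centimeter",
    "g": "gram",
    "m": "meter",
}

# Longer keys first so that a short key never shadows a longer one.
_ABBREV_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in
                        sorted(ABBREVIATIONS, key=len, reverse=True)) + r")")
_UNIT_RE = re.compile(
    r"\b(\d+)\s?(" + "|".join(sorted(UNITS, key=len, reverse=True)) + r")\b")


def expand_abbreviations(text: str) -> str:
    """Replace each known abbreviation with its spoken form."""
    return _ABBREV_RE.sub(lambda m: ABBREVIATIONS[m.group()], text)


def expand_units(text: str) -> str:
    """Expand a unit that follows a number, pluralizing unless the value is 1."""
    def repl(m: re.Match) -> str:
        number, unit = m.group(1), UNITS[m.group(2)]
        suffix = "" if int(number) == 1 else "s"
        return f"{number} {unit}{suffix}"
    return _UNIT_RE.sub(repl, text)


print(expand_abbreviations("Let's go to Dr. John Doe."))
print(expand_units("Weight is 5kg."))
```

Short unit symbols such as "m" are ambiguous (meters, minutes, million), so the unit table should be restricted to the symbols that actually occur in the corpus.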