Recording tips

If you plan to record a voice dataset for training a TTS model, check these tips and tricks:

Use a good microphone and a quiet recording room setup (no computer fans, air conditioning, ...)
Use a text corpus with cleaned numbers/abbreviations and good phoneme coverage
Read in a neutral style, but with a natural speech flow, and do not swallow letters
Adjust tone and pitch with punctuation
Use a constant recording speed
Check your recordings regularly at high volume for background noise
Take breaks regularly and do not record more than four hours a day
Record error-free
Investing in a quality audio interface and microphone can make a big difference in quality. A 24-bit, 96 kHz interface with a large-diaphragm condenser microphone can be had for about $200 USD.
Record at the highest quality level practical. You can convert to lesser formats later, but you cannot upconvert cleanly.
Review your work at regular intervals and compare with previous recordings to ensure consistent quality.
Do not be afraid to ask for help! Getting feedback on your data early on can help prevent wasted effort.
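
Several of the checks above (recording format, background noise level, constant speed, consistency between sessions) can be partly automated. The following is only a minimal sketch using the Python standard library, assuming the recordings are uncompressed integer PCM WAV files (16, 24, or 32 bit). The function names `wav_stats` and `chars_per_second` are made up for this example, and no threshold values are suggested here; compare the numbers against your own earlier recordings. It does not replace listening to the files.

```python
import math
import sys
import wave


def wav_stats(path):
    """Return format and level statistics for an integer PCM WAV file."""
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        width = wf.getsampwidth()  # bytes per sample
        rate = wf.getframerate()
        n_frames = wf.getnframes()
        raw = wf.readframes(n_frames)

    # Assumption: signed little-endian PCM (8-bit WAV is unsigned, so skip it).
    if width not in (2, 3, 4):
        raise ValueError("this sketch only handles 16/24/32-bit PCM")

    # Decode every sample; channels are interleaved and treated together.
    samples = [
        int.from_bytes(raw[i:i + width], "little", signed=True)
        for i in range(0, len(raw), width)
    ]
    full_scale = float(1 << (8 * width - 1))
    peak = max((abs(s) for s in samples), default=0)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) if samples else 0.0

    def to_dbfs(value):
        # Level relative to digital full scale; silence maps to -inf.
        return 20 * math.log10(value / full_scale) if value > 0 else float("-inf")

    return {
        "sample_rate": rate,
        "bit_depth": width * 8,
        "channels": channels,
        "duration_s": n_frames / rate,
        "peak_dbfs": to_dbfs(peak),
        "rms_dbfs": to_dbfs(rms),
    }


def chars_per_second(text, duration_s):
    """Rough speaking-rate proxy: characters of the prompt per second of audio."""
    return len(text) / duration_s


if __name__ == "__main__":
    # Usage: python check_wavs.py file1.wav file2.wav ...
    for path in sys.argv[1:]:
        s = wav_stats(path)
        print(
            f"{path}: {s['sample_rate']} Hz, {s['bit_depth']} bit, "
            f"{s['duration_s']:.2f} s, peak {s['peak_dbfs']:.1f} dBFS, "
            f"rms {s['rms_dbfs']:.1f} dBFS"
        )
```

Running this over a short "room tone" recording (microphone on, nobody speaking) gives a rough noise floor to compare between sessions. A peak at or very near 0 dBFS suggests clipping, and a `chars_per_second` value that drifts from one session to the next hints at an inconsistent reading speed.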
