Building voice datasets: Difference between revisions
Style adjustments and added dataset category. |
Added lessons learned category. |
| (2 intermediate revisions by the same user not shown) | |||
| Line 1: | Line 1: | ||
== Building a quality dataset for your STT purposes ==
There are several datasets you can build for STT purposes.
| Line 32: | Line 30: | ||
=== Quality, or can you hear me now? ===
Common Voice contains a massive range of samples. In addition to recording sentences, users can verify samples to confirm that each one matches its expected transcript. This has a two-fold benefit: recordings that don't match the transcript can be noted for exclusion, and so can the poorest-quality samples, those that are unintelligible or have other audio quality problems.
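As a rough illustration of acting on those verification results, the sketch below keeps only clips whose votes suggest they match their transcript. It assumes a tab-separated metadata file with <code>path</code>, <code>sentence</code>, <code>up_votes</code> and <code>down_votes</code> columns, as found in Common Voice releases; the sample rows, the function name <code>keep_verified</code> and the vote threshold are hypothetical and should be adjusted to your own data.

```python
import csv
import io


def keep_verified(rows, min_up_votes=2):
    """Return rows whose up votes meet the threshold and exceed the down votes.

    The threshold is an assumption for this sketch, not a Common Voice rule.
    """
    kept = []
    for row in rows:
        up = int(row["up_votes"])
        down = int(row["down_votes"])
        if up >= min_up_votes and up > down:
            kept.append(row)
    return kept


# Hypothetical rows in a tab-separated layout; real files carry more columns.
SAMPLE_TSV = (
    "path\tsentence\tup_votes\tdown_votes\n"
    "clip_a.mp3\thello world\t2\t0\n"
    "clip_b.mp3\tgood morning\t0\t2\n"
    "clip_c.mp3\tsee you later\t3\t1\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t"))
verified = keep_verified(rows)

for row in verified:
    print(row["path"], row["sentence"])
```

To run this against a real metadata file, replace the <code>io.StringIO</code> sample with an opened file handle. Raising <code>min_up_votes</code> trades dataset size for higher confidence that each clip matches its transcript.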
== References == | |||
[[Category:Dataset]] | |||
[[Category:Lessons learned]] | |||