Building voice datasets: Difference between revisions

@@ Line 1: / Line 1: @@
-[[Category:Dataset]]
 == Building a quality dataset for your STT purposes ==
 There are several datasets you can build for STT purposes.
@@ Line 32: / Line 30: @@
 === Quality, or can you hear me now? ===
 Common Voice contains a very wide range of samples. Besides collecting sentences, users can verify samples to confirm that each one matches its expected transcript. This has a two-fold benefit: samples that don't match the transcript can be flagged for exclusion, and so can the poorest-quality samples, those that are unintelligible or have other audio quality problems.
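As a rough illustration of how verification results could be used to exclude samples, the sketch below filters a tab-separated clip listing by vote counts. It assumes the listing has <code>path</code>, <code>sentence</code>, <code>up_votes</code> and <code>down_votes</code> columns, as Common Voice release TSV files do; the function names, the threshold of two up-votes and the in-memory sample rows are purely illustrative and not part of any Common Voice tooling.

```python
import csv
import io


def keep_clip(row, min_up=2):
    """Keep a clip if it has enough confirmations and they outnumber rejections.

    The min_up threshold is an assumption for this sketch, not an official rule.
    """
    up = int(row["up_votes"])
    down = int(row["down_votes"])
    return up >= min_up and up > down


def filter_clips(tsv_file, min_up=2):
    """Read a tab-separated clip listing and return only the rows worth keeping."""
    # QUOTE_NONE so quotation marks inside sentences are treated as plain text.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if keep_clip(row, min_up)]


# Made-up rows standing in for a real listing file.
SAMPLE = (
    "path\tsentence\tup_votes\tdown_votes\n"
    "a.mp3\thello world\t2\t0\n"
    "b.mp3\tgood morning\t0\t2\n"
    "c.mp3\tturn on the light\t2\t1\n"
    "d.mp3\twhat time is it\t1\t0\n"
)

kept = filter_clips(io.StringIO(SAMPLE))
kept_paths = [row["path"] for row in kept]
```

With a real listing, an open file handle can be passed in place of the <code>io.StringIO</code> object. A stricter threshold trades dataset size for quality, so the right value depends on how much data you can afford to discard.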
+== References ==
+[[Category:Dataset]]
+[[Category:Lessons learned]]