Vosk

From Open Voice Technology Wiki
Revision as of 11:32, 3 December 2021 by Florian (Created Vosk page)

Vosk is an open-source speech recognition toolkit by Alphacephei[1]. Its key features are:

  1. Supports 20+ languages and dialects: English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino, Ukrainian, Kazakh, and Swedish, with more to come.
  2. Works offline, even on lightweight devices such as Raspberry Pi, Android, and iOS.
  3. Installs with a single command: pip3 install vosk
  4. Portable per-language models are only about 50 MB each; much larger server models are also available.
  5. Provides a streaming API for the best user experience (unlike popular Python speech-recognition packages).
  6. Offers bindings for other programming languages as well, including Java, C#, and JavaScript.
  7. Allows quick reconfiguration of the vocabulary for best accuracy.
  8. Supports speaker identification in addition to plain speech recognition.
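
The streaming API and the vocabulary reconfiguration mentioned in the list can be illustrated with a minimal Python sketch. It is not taken from the Vosk documentation. The helper names (transcribe_stream, read_chunks, transcribe_wav), the chunk size of 4000 frames, and the assumption that the input is a 16-bit mono PCM WAV file are illustrative choices. The vosk calls used (Model, KaldiRecognizer, AcceptWaveform, Result, PartialResult, FinalResult) are part of the Python binding. Passing a phrase list to KaldiRecognizer generally requires one of the small models, so treat that part as an assumption about the model in use.

```python
import json
import wave


def transcribe_stream(recognizer, chunks):
    """Feed audio chunks to a recognizer and yield (kind, text) pairs.

    `recognizer` must offer AcceptWaveform, Result, PartialResult and
    FinalResult, which is the interface of vosk.KaldiRecognizer.
    """
    for chunk in chunks:
        if recognizer.AcceptWaveform(chunk):
            # The recognizer decided an utterance has ended.
            text = json.loads(recognizer.Result()).get("text", "")
            if text:
                yield ("final", text)
        else:
            # Still inside an utterance: report the current hypothesis.
            text = json.loads(recognizer.PartialResult()).get("partial", "")
            if text:
                yield ("partial", text)
    # Flush whatever is left after the last chunk.
    text = json.loads(recognizer.FinalResult()).get("text", "")
    if text:
        yield ("final", text)


def read_chunks(wav, frames_per_chunk=4000):
    """Yield raw PCM chunks from an open wave reader (hypothetical helper)."""
    while True:
        data = wav.readframes(frames_per_chunk)
        if not data:
            break
        yield data


def transcribe_wav(path, model_dir, phrases=None):
    """Transcribe a WAV file; `phrases` optionally restricts the vocabulary.

    Assumes 16-bit mono PCM input and an unpacked model in `model_dir`.
    """
    # Imported lazily so the helpers above work without vosk installed.
    from vosk import Model, KaldiRecognizer

    model = Model(model_dir)
    with wave.open(path, "rb") as wav:
        if phrases:
            # "[unk]" lets the recognizer mark words outside the phrase list.
            grammar = json.dumps(list(phrases) + ["[unk]"])
            rec = KaldiRecognizer(model, wav.getframerate(), grammar)
        else:
            rec = KaldiRecognizer(model, wav.getframerate())
        return list(transcribe_stream(rec, read_chunks(wav)))
```

A call such as transcribe_wav("test.wav", "model", phrases=["turn on the light", "turn off the light"]) would return the list of partial and final results, where "test.wav" and "model" are placeholder paths. Keeping the streaming loop separate from model loading means the same loop can be fed from a microphone or a network socket instead of a file.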