Diarization

From Voice Technology Wiki
Revision as of 16:00, 5 December 2021 by Thorsten

Speaker diarisation[1] (or diarization), also called speaker separation, is the process of partitioning an input audio stream into homogeneous segments according to speaker identity. It can make an automatic speech transcript more readable by structuring the audio stream into speaker turns and, when used together with a speaker recognition system, by attaching each speaker's true identity to those turns.
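
To illustrate what "structuring the audio stream into speaker turns" can mean in practice, here is a minimal Python sketch. It assumes a diarization system has already produced time-stamped segments with anonymous speaker labels; the `Segment` class, the `merge_into_turns` and `label_turns` helpers, the `max_gap` threshold and the sample data are all hypothetical names and values chosen for this example, not part of any particular toolkit.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float   # segment start time in seconds
    end: float     # segment end time in seconds
    speaker: str   # anonymous label such as "spk0" (assumed diarizer output)


def merge_into_turns(segments, max_gap=0.5):
    """Merge consecutive segments of the same speaker into speaker turns.

    Two segments are merged when they carry the same label and the pause
    between them is at most max_gap seconds (an arbitrary illustrative value).
    """
    turns = []
    for seg in sorted(segments, key=lambda s: s.start):
        last = turns[-1] if turns else None
        if last and last.speaker == seg.speaker and seg.start - last.end <= max_gap:
            last.end = max(last.end, seg.end)
        else:
            # Copy the segment so the input list is left unchanged.
            turns.append(Segment(seg.start, seg.end, seg.speaker))
    return turns


def label_turns(turns, identities):
    """Replace anonymous labels with known identities where available.

    `identities` stands in for the output of a speaker recognition system;
    labels it does not cover are kept as they are.
    """
    return [(t.start, t.end, identities.get(t.speaker, t.speaker)) for t in turns]


# Made-up segments for illustration only.
segments = [
    Segment(0.0, 1.2, "spk0"),
    Segment(1.3, 2.0, "spk0"),
    Segment(2.1, 3.5, "spk1"),
    Segment(3.6, 4.0, "spk0"),
]
turns = merge_into_turns(segments)
for start, end, who in label_turns(turns, {"spk0": "Alice"}):
    print(f"{start:5.1f} - {end:5.1f}  {who}")
```

The first two segments share a label and are separated by a short pause, so they collapse into a single turn. The mapping passed to `label_turns` plays the role of a speaker recognition step: it names one speaker and leaves the other under its anonymous label.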

References