Publication Details

Domain adaptation via within-class covariance correction in I-vector based speaker recognition systems

GLEMBEK Ondřej, MA Jeff, MATĚJKA Pavel, ZHANG Bing, PLCHOT Oldřich, BURGET Lukáš and MATSOUKAS Spyros. Domain Adaptation Via Within-class Covariance Correction in I-Vector Based Speaker Recognition Systems. In: Proceedings of ICASSP 2014. Florence: IEEE Signal Processing Society, 2014, pp. 4060-4064. ISBN 978-1-4799-2892-7.
Czech title
Adaptace na doménu pomocí vnitro-třídní kovarianční opravy v systému pro rozpoznávání mluvčího založeném na i-vektorech
Type
conference paper
Language
English
Authors
Glembek Ondřej, Ing., Ph.D. (DCGM FIT BUT)
Ma Jeff (Raytheon BBN)
Matějka Pavel, Ing., Ph.D. (DCGM FIT BUT)
Zhang Bing (Raytheon BBN)
Plchot Oldřich, Ing., Ph.D. (DCGM FIT BUT)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Matsoukas Spyros (Raytheon BBN)
Keywords

speaker recognition, i-vectors, source normalization, LDA, inter-dataset variability compensation

Abstract

In this paper, we have presented a within-class covariance correction technique for Linear Discriminant Analysis (LDA) estimation. We have shown that, when the correct dataset clustering is used, adapting the within-class covariance of LDA with a low-rank between-dataset covariance matrix can significantly improve the system: by up to 70% relative on the Domain Adaptation task, and by 17.5% and 36% relative on the RATS unmatched and semi-matched tasks, respectively. The dataset clustering problem points to an interesting direction for future research.

Annotation

In this paper, we propose a technique of Within-Class Covariance Correction (WCC) for Linear Discriminant Analysis (LDA) in speaker recognition. It performs an unsupervised adaptation of LDA to an unseen data domain and/or compensates for differences in speaker population among different portions of the LDA training dataset. The paper builds on the study of source-normalization and inter-database variability compensation techniques, which deal with the multimodal distribution of i-vectors. On the DARPA RATS (Robust Automatic Transcription of Speech) task, we show that, with two hours of unsupervised data, we improve the Equal-Error Rate (EER) by 17.5% and 36% relative on the unmatched and semi-matched conditions, respectively. On the Domain Adaptation Challenge, we show up to 70% relative EER reduction, and we propose a data clustering procedure to identify the directions of the domain-based variability in the adaptation data.
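
The idea of correcting the within-class covariance with a between-dataset covariance can be illustrated with a short NumPy sketch. This record does not give the paper's exact formulation, so the sketch rests on assumptions: the corrected matrix is taken to be `S_w + alpha * S_D`, where `S_D` is the covariance of the per-dataset i-vector means, and dataset labels are assumed to be known (for example, from a clustering step). The function names, the scaling factor `alpha`, and the small `ridge` term are all hypothetical and chosen for illustration only.

```python
import numpy as np


def between_dataset_cov(x, dataset_ids):
    """Covariance of per-dataset means around the global mean (low rank)."""
    x = np.asarray(x, dtype=float)
    dataset_ids = np.asarray(dataset_ids)
    global_mean = x.mean(axis=0)
    means = np.stack([x[dataset_ids == d].mean(axis=0)
                      for d in np.unique(dataset_ids)])
    centered = means - global_mean
    # Rank is at most the number of datasets.
    return centered.T @ centered / len(means)


def within_between_cov(x, speaker_ids):
    """Standard LDA within-class (S_w) and between-class (S_b) covariances."""
    x = np.asarray(x, dtype=float)
    speaker_ids = np.asarray(speaker_ids)
    mu = x.mean(axis=0)
    dim = x.shape[1]
    s_w = np.zeros((dim, dim))
    s_b = np.zeros((dim, dim))
    for s in np.unique(speaker_ids):
        xs = x[speaker_ids == s]
        ms = xs.mean(axis=0)
        d = xs - ms
        s_w += d.T @ d
        s_b += len(xs) * np.outer(ms - mu, ms - mu)
    return s_w / len(x), s_b / len(x)


def lda_with_wcc(x, speaker_ids, dataset_ids, alpha=10.0, n_dims=2, ridge=1e-6):
    """LDA projection using a corrected within-class covariance.

    Assumed correction (not taken from the paper): S_w' = S_w + alpha * S_D.
    With alpha = 0 this reduces to plain LDA (up to the ridge term).
    """
    s_w, s_b = within_between_cov(x, speaker_ids)
    s_d = between_dataset_cov(x, dataset_ids)
    s_w_corr = s_w + alpha * s_d + ridge * np.eye(s_w.shape[0])

    # Solve S_b v = lambda * S_w' v by whitening with the Cholesky factor
    # L of S_w' (S_w' = L L^T), then diagonalizing L^{-1} S_b L^{-T}.
    chol = np.linalg.cholesky(s_w_corr)
    tmp = np.linalg.solve(chol, s_b)      # L^{-1} S_b
    m = np.linalg.solve(chol, tmp.T)      # L^{-1} S_b L^{-T}
    m = 0.5 * (m + m.T)                   # remove numerical asymmetry
    eigvals, u = np.linalg.eigh(m)
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Map back: v = L^{-T} u, so that V^T S_w' V = I.
    return np.linalg.solve(chol.T, u[:, order])
```

Because the projection is normalized with respect to the corrected matrix, any direction along which the dataset means differ is penalized in proportion to `alpha`, so the resulting LDA basis tends to avoid treating dataset shift as speaker-discriminative variability. Calling `lda_with_wcc(x, speaker_ids, dataset_ids, alpha=0.0)` gives an uncorrected baseline for comparison.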

Published
2014
Pages
4060-4064
Proceedings
Proceedings of ICASSP 2014
Conference
The 39th International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Florence, IT
ISBN
978-1-4799-2892-7
Publisher
IEEE Signal Processing Society
Place
Florence, IT
DOI
10.1109/ICASSP.2014.6854359
UT WoS
000343655304011
BibTeX
@INPROCEEDINGS{FITPUB10555,
   author = "Ond\v{r}ej Glembek and Jeff Ma and Pavel Mat\v{e}jka and Bing Zhang and Old\v{r}ich Plchot and Luk\'{a}\v{s} Burget and Spyros Matsoukas",
   title = "Domain adaptation via within-class covariance correction in I-vector based speaker recognition systems",
   pages = "4060--4064",
   booktitle = "Proceedings of ICASSP 2014",
   year = 2014,
   location = "Florencie, IT",
   publisher = "IEEE Signal Processing Society",
   ISBN = "978-1-4799-2892-7",
   doi = "10.1109/ICASSP.2014.6854359",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/10555"
}