Publication Details

13 years of speaker recognition research at BUT, with longitudinal analysis of NIST SRE

MATĚJKA Pavel, PLCHOT Oldřich, GLEMBEK Ondřej, BURGET Lukáš, ROHDIN Johan A., ZEINALI Hossein, MOŠNER Ladislav, SILNOVA Anna, NOVOTNÝ Ondřej, DIEZ Sánchez Mireia and ČERNOCKÝ Jan. 13 years of speaker recognition research at BUT, with longitudinal analysis of NIST SRE. Computer Speech and Language, vol. 63, 2020, pp. 1-15. ISSN 0885-2308. Available from: https://www.sciencedirect.com/science/article/pii/S0885230819302797?via%3Dihub
Czech title
13 let výzkumu rozpoznávání řečníka na VUT s dlouhodobou analýzou na NIST SRE
Type
journal article
Language
English
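Authors
MATĚJKA Pavel, PLCHOT Oldřich, GLEMBEK Ondřej, BURGET Lukáš, ROHDIN Johan A., ZEINALI Hossein, MOŠNER Ladislav, SILNOVA Anna, NOVOTNÝ Ondřej, DIEZ Sánchez Mireia, ČERNOCKÝ Jan
URL
https://www.sciencedirect.com/science/article/pii/S0885230819302797?via%3Dihub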
Keywords

Speaker recognition, NIST, Evaluations, GMM, Eigen-channel compensation, JFA, I-vectors, DNN Embedding, X-vectors

Abstract

In this paper, we present a brief history and a "longitudinal study" of all important milestone modelling techniques used in text-independent speaker recognition since Brno University of Technology (BUT) first participated in the NIST Speaker Recognition Evaluation (SRE) in 2006: GMM MAP, GMM MAP with eigen-channel adaptation, Joint Factor Analysis, i-vector and DNN embedding (x-vector). To emphasize the historical context, the techniques are evaluated on all NIST SRE sets since 2004 on a time-machine principle, i.e. a system is always trained using all data available up to the year of evaluation. Moreover, as user-contributed audiovisual content dominates today's Internet, we include the Speakers In The Wild (SITW) and VOiCES challenge datasets as representative examples in the evaluation of our systems. We not only present a comparison of the modelling techniques but also show the effect of sampling frequency.

Published
2020
Pages
1-15
Journal
Computer Speech and Language, vol. 63, 2020, ISSN 0885-2308
Publisher
Elsevier Science
DOI
10.1016/j.csl.2019.101035
UT WoS
000534481900003
BibTeX
@ARTICLE{FITPUB12211,
   author = "Pavel Mat\v{e}jka and Old\v{r}ich Plchot and Ond\v{r}ej Glembek and Luk\'{a}\v{s} Burget and A. Johan Rohdin and Hossein Zeinali and Ladislav Mo\v{s}ner and Anna Silnova and Ond\v{r}ej Novotn\'{y} and Mireia S\'{a}nchez Diez and Jan \v{C}ernock\'{y}",
   title = "13 years of speaker recognition research at BUT, with longitudinal analysis of NIST SRE",
   pages = "1--15",
   journal = "Computer Speech and Language",
   volume = 63,
   year = 2020,
   ISSN = "0885-2308",
   doi = "10.1016/j.csl.2019.101035",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/12211"
}