Publication Details
Analysis of the DNN-Based SRE Systems in Multi-language Conditions
Matějka Pavel, Ing., Ph.D. (DCGM FIT BUT)
Glembek Ondřej, Ing., Ph.D. (DCGM FIT BUT)
Plchot Oldřich, Ing., Ph.D. (DCGM FIT BUT)
Grézl František, Ing., Ph.D. (DCGM FIT BUT)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT)
DNN, Multi-Language, Speaker Recognition
This paper analyzes the behavior of our state-of-the-art Deep Neural Network/i-vector/PLDA-based speaker recognition systems in multi-language conditions. On the "Language Pack" of the PRISM set, we evaluate the systems' performance using NIST's standard metrics. We show that the gain from using DNNs vanishes, that using dedicated DNNs for the target conditions does not help, and that the DNN-based systems also tend to produce de-calibrated scores under the studied conditions. This work gives suggestions for directions of future research rather than any particular solutions to these issues.
In this work, we have studied the behavior of the DNN techniques in SRE i-vector/PLDA systems, currently considered to be state-of-the-art, as evaluated on the most common NIST SRE English test sets, such as the NIST SRE 2010, condition 5.
@INPROCEEDINGS{FITPUB11309,
   author = "Ond\v{r}ej Novotn\'{y} and Pavel Mat\v{e}jka and Ond\v{r}ej Glembek and Old\v{r}ich Plchot and Franti\v{s}ek Gr\'{e}zl and Luk\'{a}\v{s} Burget and Jan \v{C}ernock\'{y}",
   title = "Analysis of the DNN-Based SRE Systems in Multi-language Conditions",
   pages = "199--204",
   booktitle = "Proceedings of SLT 2016",
   year = 2016,
   location = "San Diego, US",
   publisher = "IEEE Signal Processing Society",
   ISBN = "978-1-5090-4903-5",
   doi = "10.1109/slt.2016.7846265",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/11309"
}