Publication Details
Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation 2006
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT)
Glembek Ondřej, Ing., Ph.D. (DCGM FIT BUT)
Grézl František, Ing., Ph.D. (DCGM FIT BUT)
Karafiát Martin, Ing., Ph.D. (DCGM FIT BUT)
van Leeuwen David (TNO)
Matějka Pavel, Ing. (UREL FEEC BUT)
Schwarz Petr, Ing., Ph.D. (DCGM FIT BUT)
Strasheim Albert (USB)
speaker recognition
The paper describes the fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation 2006.
This paper describes and discusses the "STBU" speaker recognition system, which performed well in the NIST Speaker Recognition Evaluation 2006 (SRE). STBU is a consortium of four partners: Spescom DataVoice (South Africa), TNO (The Netherlands), BUT (Czech Republic) and University of Stellenbosch (South Africa). The STBU system was a combination of three main kinds of sub-systems: (1) GMM, with short-time MFCC or PLP features, (2) GMM-SVM, using GMM mean supervectors as input to an SVM, and (3) MLLR-SVM, using MLLR speaker adaptation coefficients derived from an English LVCSR system. All sub-systems made use of supervector subspace channel compensation methods: either eigenchannel adaptation or nuisance attribute projection. We document the design and performance of all sub-systems, as well as their fusion and calibration via logistic regression. Finally, we also present a cross-site fusion that was done with several additional systems from other NIST SRE-2006 participants.
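As an illustration of the fusion step mentioned in the abstract, the following is a minimal sketch of linear logistic-regression fusion of sub-system scores. It is not the STBU implementation: the synthetic scores, the plain gradient-descent optimizer, and the names `train_fusion` and `fuse` are assumptions made for this example only.

```python
import numpy as np

def train_fusion(scores, labels, lr=0.1, n_iter=2000):
    """Fit weights w and offset b by minimizing the logistic-regression
    (cross-entropy) objective with plain gradient descent.

    scores: (n_trials, n_systems) array of sub-system scores
    labels: (n_trials,) array, 1 for target trials, 0 for non-target trials
    """
    n_trials, n_systems = scores.shape
    w = np.zeros(n_systems)
    b = 0.0
    for _ in range(n_iter):
        fused = scores @ w + b
        p = 1.0 / (1.0 + np.exp(-fused))  # sigmoid of the fused score
        err = p - labels                  # gradient of cross-entropy w.r.t. fused score
        w -= lr * (scores.T @ err) / n_trials
        b -= lr * err.mean()
    return w, b

def fuse(scores, w, b):
    """Fused score: weighted sum of sub-system scores plus an offset."""
    return scores @ w + b

# Hypothetical scores from two sub-systems (synthetic, for illustration only).
rng = np.random.default_rng(0)
n = 500
labels = np.concatenate([np.ones(n), np.zeros(n)])
sys_a = np.concatenate([rng.normal(2.0, 1.0, n), rng.normal(0.0, 1.0, n)])
sys_b = np.concatenate([rng.normal(1.0, 1.0, n), rng.normal(-1.0, 1.0, n)])
scores = np.column_stack([sys_a, sys_b])

w, b = train_fusion(scores, labels)
fused = fuse(scores, w, b)
accuracy = np.mean((fused > 0) == (labels == 1))
```

Because the fused score is trained under a logistic-regression objective, it can be read as an approximate log-odds value, which is why one procedure can serve for both fusion and calibration in this kind of sketch.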
@ARTICLE{FITPUB8470,
  author = "Niko Br{\"{u}}mmer and Luk\'{a}\v{s} Burget and Jan \v{C}ernock\'{y} and Ond\v{r}ej Glembek and Franti\v{s}ek Gr\'{e}zl and Martin Karafi\'{a}t and David van Leeuwen and Pavel Mat\v{e}jka and Petr Schwarz and Albert Strasheim",
  title = "Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation 2006",
  pages = "2072--2084",
  journal = "IEEE Transactions on Audio, Speech, and Language Processing",
  volume = 15,
  number = 7,
  year = 2007,
  ISSN = "1558-7916",
  language = "english",
  url = "https://www.fit.vut.cz/research/publication/8470"
}