Publication Details
BUT-PT System Description for NIST LRE 2017
Plchot Oldřich, Ing., Ph.D. (DCGM FIT BUT)
Novotný Ondřej, Ing., Ph.D. (DCGM FIT BUT)
Cumani Sandro (POLITO)
Lozano Díez Alicia (UAM)
Slavíček Josef (Phonexia)
Diez Sánchez Mireia, M.Sc., Ph.D. (DCGM FIT BUT)
Grézl František, Ing., Ph.D. (DCGM FIT BUT)
Glembek Ondřej, Ing., Ph.D. (DCGM FIT BUT)
Kamsali Veera Mounika (DCGM FIT BUT)
Silnova Anna, MSc., Ph.D. (DCGM FIT BUT)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Ondel Yang Lucas Antoine Francois, Mgr., Ph.D. (DCGM FIT BUT)
Kesiraju Santosh (DCGM FIT BUT)
Rohdin Johan A., Dr. (DCGM FIT BUT)
speech recognition, language recognition
This paper describes the BUT-PT systems built for the NIST LRE 2017 evaluation. We built over 30 systems for this evaluation, with the main focus on building a single best system. We experimented with a denoising NN, automatically discovered units, different flavors of phonotactic systems, different backends, different sizes of i-vector systems, different BN features, NN embeddings, and frame-level language classifiers. The evaluation plan stated: "Teams are encouraged to report whether and how having access to the development set helped improve the performance". The development data helped mainly with the final classifier. It also helped us decide which techniques to use and which to fuse, because our test set consisted of this data.
Our submission is a collaborative effort of BUT, Politecnico di Torino, Universidad Autónoma de Madrid, and Phonexia. The main body of work was conducted at the end of September and the beginning of October 2017, when the whole team met in Brno and all members worked closely together on common datasets. All of our individual systems rely on bottleneck features (BNF) [1, 2] as frontends. Most of our systems are still based on i-vectors and a subsequent generative classifier. We also complement the classical i-vector-based systems with a system based on embeddings obtained from a discriminatively trained end-to-end LRE system. Finally, the primary submission is a fusion of four systems, in which we utilize two different BNF extractors, non-linear processing of i-vectors, and embeddings obtained from the discriminative system.
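The abstract does not say which generative classifier or which fusion method was used, so the following is only a minimal sketch under stated assumptions: a Gaussian classifier with one shared covariance matrix scores fixed-length vectors (stand-ins for i-vectors or embeddings), and the per-language scores of two toy systems are combined by a weighted sum. All function names, the toy data, the ridge term, and the equal fusion weights are hypothetical and are not taken from the paper.

```python
import numpy as np


def train_shared_cov_gaussian(x, y, n_classes, ridge=1e-3):
    """Fit per-class means and one shared covariance (an assumed model)."""
    means = np.stack([x[y == c].mean(axis=0) for c in range(n_classes)])
    centered = x - means[y]  # remove each sample's own class mean
    cov = centered.T @ centered / len(x) + ridge * np.eye(x.shape[1])
    return means, np.linalg.inv(cov)


def log_likelihood_scores(x, means, precision):
    """Per-class Gaussian log-likelihoods, up to a class-independent term."""
    # log N(x | m_c, S) = x^T P m_c - 0.5 m_c^T P m_c + terms without c
    linear = x @ precision @ means.T
    offset = 0.5 * np.einsum("cd,de,ce->c", means, precision, means)
    return linear - offset


def fuse_scores(score_list, weights):
    """Weighted sum of per-system score matrices (weights are hypothetical)."""
    return sum(w * s for w, s in zip(weights, score_list))


def toy_features(rng, labels, centers, noise=0.5):
    """Synthetic vectors standing in for i-vectors or embeddings."""
    shape = (len(labels), centers.shape[1])
    return centers[labels] + noise * rng.standard_normal(shape)


rng = np.random.default_rng(0)
n_classes = 3
train_labels = np.repeat(np.arange(n_classes), 50)
test_labels = np.repeat(np.arange(n_classes), 20)

# Two toy "systems" that see the same utterances through different features.
system_centers = [3.0 * np.eye(n_classes), 2.0 * np.eye(n_classes)]

system_scores = []
for centers in system_centers:
    x_train = toy_features(rng, train_labels, centers)
    x_test = toy_features(rng, test_labels, centers)
    means, precision = train_shared_cov_gaussian(x_train, train_labels, n_classes)
    system_scores.append(log_likelihood_scores(x_test, means, precision))

fused = fuse_scores(system_scores, weights=[0.5, 0.5])
accuracy = float((fused.argmax(axis=1) == test_labels).mean())
print(f"fused accuracy on toy data: {accuracy:.3f}")
```

In practice fusion weights would be trained on held-out data rather than fixed by hand; the equal weights here only keep the sketch short.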
@INPROCEEDINGS{FITPUB11655,
   author = "Pavel Mat\v{e}jka and Old\v{r}ich Plchot and Ond\v{r}ej Novotn\'{y} and Sandro Cumani and Alicia D\'{i}ez Lozano and Josef Slav\'{i}\v{c}ek and Mireia S\'{a}nchez Diez and Franti\v{s}ek Gr\'{e}zl and Ond\v{r}ej Glembek and Mounika Veera Kamsali and Anna Silnova and Luk\'{a}\v{s} Burget and Francois Antoine Lucas Yang Ondel and Santosh Kesiraju and A. Johan Rohdin",
   title = "BUT-PT System Description for NIST LRE 2017",
   pages = "1--6",
   booktitle = "Proceedings of NIST Language Recognition Workshop 2017",
   year = 2017,
   location = "Orlando, Florida, US",
   publisher = "National Institute of Standards and Technology",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/11655"
}