Publication Details
Multilingual BLSTM and Speaker-Specific Vector Adaptation in 2016 BUT BABEL SYSTEM
Baskar Murali K. (DCGM FIT BUT)
Matějka Pavel, Ing., Ph.D. (DCGM FIT BUT)
Veselý Karel, Ing., Ph.D. (DCGM FIT BUT)
Grézl František, Ing., Ph.D. (DCGM FIT BUT)
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT)
Automatic speech recognition, Multilingual neural networks, Bidirectional Long Short Term Memory, i-vectors, Sequence Summarizing Neural Networks.
This paper provides an extensive summary of the BUT 2016 system for the last Babel evaluations. It concentrates on multilingual training of both DNN-based features and acoustic models, and on low-dimensional vector approaches to speaker adaptation.
This paper provides an extensive summary of the BUT 2016 system for the last IARPA Babel evaluations. It concentrates on multilingual training of both deep neural network (DNN)-based feature extraction and acoustic models, including multilingual training of bidirectional Long Short-Term Memory (BLSTM) networks. Next, two low-dimensional vector approaches to speaker adaptation are investigated: i-vectors and sequence-summarizing neural networks (SSNN). The results on three Babel Year 4 languages show a clear advantage of both approaches when only a limited amount of training data is available. The time needed to develop a new system is also addressed, as some of the investigated techniques do not require extensive retraining of the whole system.
@INPROCEEDINGS{FITPUB11310,
   author = "Martin Karafi\'{a}t and K. Murali Baskar and Pavel Mat\v{e}jka and Karel Vesel\'{y} and Franti\v{s}ek Gr\'{e}zl and Jan \v{C}ernock\'{y}",
   title = "Multilingual BLSTM and Speaker-Specific Vector Adaptation in 2016 BUT BABEL SYSTEM",
   pages = "637--643",
   booktitle = "Proceedings of SLT 2016",
   year = 2016,
   location = "San Diego, US",
   publisher = "IEEE Signal Processing Society",
   ISBN = "978-1-5090-4903-5",
   doi = "10.1109/SLT.2016.7846330",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/11310"
}