Publication Details
BUT BABEL System for Spontaneous Cantonese
Grézl František, Ing., Ph.D. (DCGM FIT BUT)
Hannemann Mirko, Dipl.-Ing. (DCGM FIT BUT)
Veselý Karel, Ing., Ph.D. (DCGM FIT BUT)
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT)
speech recognition, discriminative training, bottle-neck neural networks, region-dependent transforms
This article describes the novel elements we have brought to our BABEL Cantonese system, including 6-layer Stacked Bottle-Neck (SBN) features and the use of F0 at the input of this neural network. We have also investigated the robustness of SBN training (silence, normalization) and shown an efficient combination with PLP and, again, F0 features using Region-Dependent Transforms (RDT). Last but not least, a combination of RDT with another popular adaptation technique, speaker adaptive training (SAT), was shown to be beneficial.
This paper presents our work on speech recognition of spontaneous Cantonese telephone conversations. The key points include feature extraction by a 6-layer Stacked Bottle-Neck (SBN) neural network and the use of fundamental frequency (F0) information at its input. We have also investigated the robustness of SBN training (silence, normalization) and shown an efficient combination with PLP features using Region-Dependent Transforms (RDT). A combination of RDT with another popular adaptation technique, speaker adaptive training (SAT), was shown to be beneficial. The results are reported on BABEL Cantonese data.
@INPROCEEDINGS{FITPUB10423,
   author = "Martin Karafi\'{a}t and Franti\v{s}ek Gr\'{e}zl and Mirko Hannemann and Karel Vesel\'{y} and Jan \v{C}ernock\'{y}",
   title = "BUT BABEL System for Spontaneous Cantonese",
   pages = "2589--2593",
   booktitle = "Proceedings of Interspeech 2013",
   journal = "Proceedings of the 14th Annual Conference of the International Speech Communication Association (Interspeech 2013).",
   number = 8,
   year = 2013,
   location = "Lyon, FR",
   publisher = "International Speech Communication Association",
   ISBN = "978-1-62993-443-3",
   ISSN = "2308-457X",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/10423"
}