Publication Details
Combination of Multilingual and Semi-Supervised Training for Under-Resourced Languages
Keywords: feature extraction, neural networks, stacked bottle-neck, multilingual training, semi-supervised training
Multilingual training of neural networks for ASR is widely studied these days. It has been shown that languages with little training data can benefit greatly from multilingual resources during training. The use of unlabeled data for neural network training in a semi-supervised manner has also improved ASR system performance. Here, the combination of both methods is presented. First, multilingual training is performed to obtain an ASR system that automatically transcribes the unlabeled data. Then, the automatically transcribed data are added to the training set. Two neural networks are trained, one from random initialization and one adapted from the multilingual network, to evaluate the effect of multilingual training in the presence of a larger amount of training data. Further, the CMLLR transform is applied in the middle of the stacked Bottle-Neck neural network structure. As CMLLR rotates the features to better fit the given model, we evaluated whether it is better to adapt the existing NN on the CMLLR features or to train it from random initialization. The last step in our training procedure is fine-tuning on the original data.
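For illustration, the procedure described in the abstract can be sketched as a short pipeline. The following is a minimal NumPy sketch under loose assumptions: the one-hidden-layer classifier is only a stand-in for the stacked Bottle-Neck network, the CMLLR feature transform step is omitted, and all names (train_nn, transcribe, X_multi, ...) are hypothetical rather than the authors' code.

# Illustrative sketch of the combined multilingual + semi-supervised pipeline.
# All names are hypothetical; the tiny classifier stands in for the stacked
# Bottle-Neck network, and the CMLLR rotation step is not modeled here.
import numpy as np

rng = np.random.default_rng(0)

def train_nn(features, labels, init_weights=None, n_hidden=64, epochs=20, lr=0.1):
    """Train a one-hidden-layer softmax classifier. Passing `init_weights`
    corresponds to adapting an existing (e.g. multilingual) network;
    `init_weights=None` corresponds to random initialization."""
    n_in, n_out = features.shape[1], int(labels.max()) + 1
    if init_weights is None:
        W1 = rng.normal(0.0, 0.1, (n_in, n_hidden))
        W2 = rng.normal(0.0, 0.1, (n_hidden, n_out))
    else:
        W1, W2 = (w.copy() for w in init_weights)
    for _ in range(epochs):
        h = np.tanh(features @ W1)                          # hidden layer
        logits = h @ W2
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        p[np.arange(len(labels)), labels] -= 1.0            # softmax cross-entropy gradient
        W2 -= lr * h.T @ p / len(labels)
        W1 -= lr * features.T @ ((p @ W2.T) * (1.0 - h ** 2)) / len(labels)
    return W1, W2

def transcribe(weights, features):
    """Label unlabeled data with the current network (the semi-supervised step)."""
    W1, W2 = weights
    return (np.tanh(features @ W1) @ W2).argmax(axis=1)

# Toy data standing in for the real corpora.
X_multi = rng.normal(size=(200, 20))   # multilingual, labeled
y_multi = rng.integers(0, 5, 200)
X_orig = rng.normal(size=(50, 20))     # target language, manually transcribed
y_orig = rng.integers(0, 5, 50)
X_unlab = rng.normal(size=(300, 20))   # target language, untranscribed

# 1) Multilingual training.
ml_net = train_nn(X_multi, y_multi)

# 2) Automatic transcription of the unlabeled data.
y_auto = transcribe(ml_net, X_unlab)
X_semi = np.vstack([X_orig, X_unlab])
y_semi = np.concatenate([y_orig, y_auto])

# 3) Two networks on the enlarged data: random init vs. multilingual adaptation.
net_rand = train_nn(X_semi, y_semi)
net_adapt = train_nn(X_semi, y_semi, init_weights=ml_net)

# 4) Fine-tuning on the original, manually transcribed data.
net_final = train_nn(X_orig, y_orig, init_weights=net_adapt)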
@INPROCEEDINGS{FITPUB10716,
  author    = "Franti\v{s}ek Gr\'{e}zl and Martin Karafi\'{a}t",
  title     = "Combination of Multilingual and Semi-Supervised Training for Under-Resourced Languages",
  pages     = "820--824",
  booktitle = "Proceedings of Interspeech 2014",
  year      = 2014,
  location  = "Singapore, SG",
  publisher = "International Speech Communication Association",
  ISBN      = "978-1-63439-435-2",
  language  = "english",
  url       = "https://www.fit.vut.cz/research/publication/10716"
}