Publication Details

Semi-Supervised Bootstrapping Approach For Neural Network Feature Extractor Training

GRÉZL František and KARAFIÁT Martin. Semi-Supervised Bootstrapping Approach For Neural Network Feature Extractor Training. In: Proceedings of ASRU 2013. Olomouc: IEEE Signal Processing Society, 2013, pp. 470-475. ISBN 978-1-4799-2755-5.
Czech title
Částečně dohlížená bootstrapová metoda pro trénování neuronových sítí pro extrakci příznaků
Type
conference paper
Language
English
Authors
František Grézl, Martin Karafiát
URL
Keywords

Semi-supervised training, bootstrapping, bottle-neck features

Abstract

This paper presents a bootstrapping approach for training the Bottle-Neck neural network (NN) feature extractor, which provides features for a subsequent GMM-HMM recognizer. The recognizer is used to automatically transcribe the unsupervised (untranscribed) data and to assign a confidence to each transcription. Based on the confidence, segments are selected and mixed with the supervised data, and new NNs are trained. The automatic transcription can recover 40-55% of the difference between partially and fully transcribed data. This is a 3 to 5% absolute improvement over an NN trained on supervised data only. Using the 70-85% of automatically transcribed segments with the highest confidence was found optimal for achieving this result. Dropping the rest of the data prevents training on low-quality transcripts.
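
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Segment` structure and the functions `select_confident` and `build_training_set` are hypothetical names introduced here, and the confidence measure is assumed to be a single score per segment. The default `keep_fraction` of 0.8 is simply one value inside the 70-85% range reported above.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One speech segment with its transcript (hypothetical structure)."""
    segment_id: str
    transcript: str
    # Assumption: manually transcribed segments are treated as fully trusted.
    confidence: float = 1.0


def select_confident(segments, keep_fraction):
    """Return the keep_fraction of segments with the highest confidence."""
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be between 0 and 1")
    # Rank from most to least confident, then keep the top part.
    ranked = sorted(segments, key=lambda s: s.confidence, reverse=True)
    n_keep = int(len(ranked) * keep_fraction)  # rounds down
    return ranked[:n_keep]


def build_training_set(supervised, auto_transcribed, keep_fraction=0.8):
    """Mix supervised data with the most confident automatic transcripts."""
    return list(supervised) + select_confident(auto_transcribed, keep_fraction)


if __name__ == "__main__":
    manual = [Segment("man-0", "manual transcript")]
    automatic = [
        Segment(f"auto-{i}", "automatic transcript", confidence=i / 10)
        for i in range(10)
    ]
    training_set = build_training_set(manual, automatic, keep_fraction=0.8)
    for seg in training_set:
        print(seg.segment_id, seg.confidence)
```

In a full bootstrapping loop, the NN trained on this mixed set would feed a new GMM-HMM recognizer, which could then re-transcribe the unsupervised data. That iteration is outside the scope of this sketch.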

Annotation

This paper presents a bootstrapping approach for neural network (NN) training. The neural networks serve as a bottle-neck feature extractor for a subsequent GMM-HMM recognizer. The recognizer is also used to transcribe the untranscribed data and to assign a confidence to each transcription. Based on the confidence, segments are selected and mixed with the supervised data, and new NNs are trained. With this approach, it is possible to recover 40-55% of the difference between partially and fully transcribed data (a 3 to 5% absolute improvement over an NN trained on supervised data only). Using the 70-85% of automatically transcribed segments with the highest confidence was found optimal for achieving this result.

Published
2013
Pages
470-475
Proceedings
Proceedings of ASRU 2013
Conference
IEEE 2013 Workshop on Automatic Speech Recognition and Understanding, Olomouc, CZ
ISBN
978-1-4799-2755-5
Publisher
IEEE Signal Processing Society
Place
Olomouc, CZ
BibTeX
@INPROCEEDINGS{FITPUB10469,
   author = "Franti\v{s}ek Gr\'{e}zl and Martin Karafi\'{a}t",
   title = "Semi-Supervised Bootstrapping Approach For Neural Network Feature Extractor Training",
   pages = "470--475",
   booktitle = "Proceedings of ASRU 2013",
   year = 2013,
   location = "Olomouc, CZ",
   publisher = "IEEE Signal Processing Society",
   ISBN = "978-1-4799-2755-5",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/10469"
}