Publication Details
Improved MLP Structures for Data-Driven Feature Extraction for ASR
Chen Barry, MSc. (ICSI Berkeley)
Grézl František, Ing., Ph.D. (DCGM FIT BUT)
Morgan Nelson, Prof. (ICSI Berkeley)
feature extraction, MLP structure, time-frequency patterns
Data-driven feature extraction for ASR using improved MLP structures. Four-layer MLPs are used for the feature extraction. It is shown that the first hidden layer of a four-layer MLP is able to detect some basic patterns from the time-frequency plane.
In this paper, we present our recent progress on multi-layer perceptron (MLP) based data-driven feature extraction using improved MLP structures. Four-layer MLPs are used in this study. Different signal processing methods are applied before the input layer of the MLP. We show that the first hidden layer of a four-layer MLP is able to detect some basic patterns from the time-frequency plane. KLT-based dimension reduction along time is applied as a modulation frequency filter. The new feature extraction was tested on a large vocabulary continuous speech recognition (LVCSR) task using the NIST 2001 evaluation set. We achieved an 11.6% relative word error rate (WER) reduction compared to the traditional PLP-based baseline feature. This is also a significant improvement compared to our previously published results on the same task using MLP-based features with three-layer MLPs.
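As a rough illustration of what KLT-based dimension reduction along time can look like, the sketch below estimates a KLT (equivalently, PCA) basis from temporal trajectories of a single frequency band and projects each trajectory onto the leading basis vectors. It is not the authors' implementation: the function names, the 51-frame context, the 8 retained components, and the synthetic data are all assumptions chosen for illustration. Each retained basis vector can be read as a filter applied along the time axis, which is one way to interpret the "modulation frequency filter" described in the abstract.

```python
import numpy as np


def klt_basis(trajectories, n_components):
    """Estimate a KLT (PCA) basis from temporal trajectories.

    trajectories: array of shape (n_examples, n_frames), one band-energy
    trajectory per row (hypothetical input layout).
    Returns the mean trajectory and an (n_frames, n_components) basis.
    """
    mean = trajectories.mean(axis=0)
    centered = trajectories - mean
    # Sample covariance over the time axis.
    cov = centered.T @ centered / (len(trajectories) - 1)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order]


def klt_project(trajectories, mean, basis):
    """Project trajectories onto the retained KLT basis vectors."""
    return (trajectories - mean) @ basis


# Synthetic stand-in data (assumption): a slow sinusoid with random
# amplitude plus white noise, 51 frames per trajectory.
rng = np.random.default_rng(0)
n_examples, n_frames, n_keep = 500, 51, 8
t = np.arange(n_frames)
slow = np.sin(2 * np.pi * t / n_frames)
data = rng.normal(size=(n_examples, 1)) * slow
data = data + 0.1 * rng.normal(size=(n_examples, n_frames))

mean, basis = klt_basis(data, n_keep)
reduced = klt_project(data, mean, basis)
print(reduced.shape)
```

In a setup like the one the abstract describes, the reduced trajectories from all bands would presumably be concatenated and fed to the MLP input layer; that wiring is not shown here because the abstract does not specify it.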
@INPROCEEDINGS{FITPUB7909,
  author = "Qifeng Zhu and Barry Chen and Franti\v{s}ek Gr\'{e}zl and Nelson Morgan",
  title = "Improved MLP Structures for Data-Driven Feature Extraction for ASR",
  pages = 4,
  booktitle = "Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology",
  journal = "European Speech Communication",
  year = 2005,
  location = "Lisabon, PT",
  ISSN = "1018-4074",
  language = "english",
  url = "https://www.fit.vut.cz/research/publication/7909"
}