Publication Details
Front-End Compensation Methods for LVCSR Under Lombard Effect
speech recognition, Lombard effect, UT-Scope database, bottleneck features, quantile-based cepstral distribution normalization, histogram equalization
This paper describes front-end compensation methods for large vocabulary continuous speech recognition (LVCSR) under Lombard effect.
This study analyzes the impact of noisy background variations and Lombard effect (LE) on large vocabulary continuous speech recognition (LVCSR). The robustness of several front-end feature extraction strategies combined with state-of-the-art feature distribution normalizations is tested on neutral and Lombard speech from the UT-Scope database, presented in two types of background noise at various SNR levels. An extension of a bottleneck (BN) front-end utilizing normalization of both critical band energies (CRBE) and BN outputs is proposed and shown to provide competitive performance compared to the best MFCC-based system. A novel MFCC-based BN front-end is introduced and shown to outperform all other systems in all conditions considered (average 4.1% absolute WER reduction over the second-best system). Additionally, two phenomena are observed: (i) the combination of cepstral mean subtraction and the recently established RASTALP filtering significantly reduces transient effects of RASTA band-pass filtering and increases ASR robustness to noise and LE; (ii) histogram equalization may benefit from utilizing reference distributions derived from pre-normalized rather than raw training features, and also from adopting distributions from different front-ends.
@INPROCEEDINGS{FITPUB9756,
   author = "Hynek Bo\v{r}il and Franti\v{s}ek Gr\'{e}zl and H. John Hansen",
   title = "Front-End Compensation Methods for LVCSR Under Lombard Effect",
   pages = "1257--1260",
   booktitle = "Proceedings of Interspeech 2011",
   journal = "Proceedings of Interspeech - on-line",
   volume = 2011,
   number = 8,
   year = 2011,
   location = "Florence, IT",
   publisher = "International Speech Communication Association",
   ISBN = "978-1-61839-270-1",
   ISSN = "1990-9772",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/9756"
}