Publication Details

Approaches to automatic lexicon learning with limited training examples

GOEL Nagendra K., THOMAS Samuel, AGARWAL Mohit, AKYAZI Pinar, BURGET Lukáš, FENG Kai, GHOSHAL Arnab, GLEMBEK Ondřej, KARAFIÁT Martin, POVEY Daniel, RASTROW Ariya, ROSE Richard and SCHWARZ Petr. Approaches to automatic lexicon learning with limited training examples. In: Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010, pp. 5094-5097. ISBN 978-1-4244-4296-6. ISSN 1520-6149.
Czech title
Přístupy k automatickému učení slovníku s omezenými trénovacími daty
Type
conference paper
Language
English
Authors
Goel Nagendra K. (GOVIVACE)
Thomas Samuel (JHU)
Agarwal Mohit (IIIT)
Akyazi Pinar (UBOGAZ)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Feng Kai (HKUST)
Ghoshal Arnab (UEDIN)
Glembek Ondřej, Ing., Ph.D. (DCGM FIT BUT)
Karafiát Martin, Ing., Ph.D. (DCGM FIT BUT)
Povey Daniel (JHU)
Rastrow Ariya (JHU)
Rose Richard (MCGILL)
Schwarz Petr, Ing., Ph.D. (DCGM FIT BUT)
Keywords

Lexicon Learning, LVCSR

Abstract

This paper examines approaches to automatic lexicon learning with limited training examples, using a combination of lexicon learning techniques.

Annotation

Preparing a lexicon for a speech recognition system can take significant effort in languages where the written form is not exactly phonetic. Conversely, in languages where the written form is quite phonetic, some common words are often mispronounced. In this paper, we use a combination of lexicon learning techniques to explore whether a lexicon can be learned when only a small lexicon is available for bootstrapping. We find that for a phonetic language such as Spanish, it is possible to do so with better results than generic rules or hand-crafted pronunciations give. For a more complex language such as English, lexicon learning is still possible, but with some loss of accuracy.

Published
2010
Pages
5094-5097
Journal
Proc. International Conference on Acoustics, Speech, and Signal Processing, vol. 2010, no. 3, ISSN 1520-6149
Proceedings
Proc. International Conference on Acoustics, Speech, and Signal Processing
Conference
International Conference on Acoustics, Speech, and Signal Processing 2010, Dallas, US
ISBN
978-1-4244-4296-6
Publisher
IEEE Signal Processing Society
Place
Dallas, US
BibTeX
@INPROCEEDINGS{FITPUB9309,
   author = "Nagendra K. Goel and Samuel Thomas and Mohit Agarwal and Pinar Akyazi and Luk\'{a}\v{s} Burget and Kai Feng and Arnab Ghoshal and Ond\v{r}ej Glembek and Martin Karafi\'{a}t and Daniel Povey and Ariya Rastrow and Richard Rose and Petr Schwarz",
   title = "Approaches to automatic lexicon learning with limited training examples",
   pages = "5094--5097",
   booktitle = "Proc. International Conference on Acoustics, Speech, and Signal Processing",
   journal = "Proc. International Conference on Acoustics, Speech, and Signal Processing",
   volume = 2010,
   number = 3,
   year = 2010,
   location = "Dallas, US",
   publisher = "IEEE Signal Processing Society",
   ISBN = "978-1-4244-4296-6",
   ISSN = "1520-6149",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/9309"
}