Publication Details
Dealing with Numbers in Grapheme-Based Speech Recognition
Karafiát Martin, Ing., Ph.D. (DCGM FIT BUT)
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT)
LVCSR, ASR, grapheme, phoneme, speech recognition.
The grapheme-based approach to speech recognition is suitable for low-resource languages, where obtaining a pronunciation dictionary is time-consuming and costly. The paper describes the process of automatically generating pronunciation dictionaries, with emphasis on the expansion of numbers, and presents results on the GlobalPhone database.
This article presents the results of grapheme-based speech recognition for eight languages. The need for this approach arises for low-resource languages, where obtaining a pronunciation dictionary is time-consuming, costly, or impossible. In such scenarios, using grapheme dictionaries is the simplest and most straightforward option. The paper describes the process of automatically generating pronunciation dictionaries, with emphasis on the expansion of numbers. Experiments on the GlobalPhone database show that grapheme-based systems achieve results comparable to phoneme-based ones, especially for phonetic languages.
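The idea of a grapheme dictionary with number expansion can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`spell_number`, `grapheme_pronunciation`, `build_lexicon`), the English-only number speller, and the 0..999 range are all assumptions made for illustration, and the paper's actual per-language expansion rules are not reproduced here.

```python
from typing import Dict, Iterable, List

# Toy English vocabulary for numbers; a real system needs rules per language.
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def spell_number(n: int) -> str:
    """Spell an integer in the range 0..999 as English words (hypothetical helper)."""
    if not 0 <= n <= 999:
        raise ValueError("this sketch only covers 0..999")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else TENS[tens] + " " + ONES[ones]
    hundreds, rest = divmod(n, 100)
    head = ONES[hundreds] + " hundred"
    return head if rest == 0 else head + " " + spell_number(rest)


def grapheme_pronunciation(word: str) -> List[str]:
    """Use the word's own letters as its pronunciation units."""
    return [ch for ch in word.lower() if ch.isalpha()]


def build_lexicon(tokens: Iterable[str]) -> Dict[str, List[str]]:
    """Map each word to its grapheme sequence, expanding digit tokens first."""
    lexicon: Dict[str, List[str]] = {}
    for token in tokens:
        if token.isascii() and token.isdigit():
            # Digits carry no usable graphemes, so spell the number out.
            words = spell_number(int(token)).split()
        else:
            words = [token]
        for word in words:
            word = word.lower()
            lexicon.setdefault(word, grapheme_pronunciation(word))
    return lexicon
```

Under these assumptions, `build_lexicon(["room", "42"])` yields entries for `room`, `forty`, and `two`, each mapped to its letters. The point of the expansion step is that a digit string such as `42` has no letters to serve as acoustic units until it is written out as words.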
@INPROCEEDINGS{FITPUB10129,
   author = "Milo\v{s} Janda and Martin Karafi\'{a}t and Jan \v{C}ernock\'{y}",
   title = "Dealing with Numbers in Grapheme-Based Speech Recognition",
   pages = "438--445",
   booktitle = "Proceedings of 15th International Conference on Text, Speech and Dialogue",
   series = "Lecture Notes in Computer Science, 2012, Volume 7499",
   journal = "Lecture Notes in Computer Science",
   volume = 2012,
   number = 9,
   year = 2012,
   location = "Springer-Verlag Berlin Heidelberg 2012, DE",
   publisher = "Springer Verlag",
   ISBN = "978-3-642-32789-6",
   ISSN = "0302-9743",
   doi = "10.1007/978-3-642-32790-2\_53",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/10129"
}