Result Details

Advances in Acoustic Modeling for the Recognition of Czech

KOPECKÝ, J.; GLEMBEK, O.; KARAFIÁT, M. Advances in Acoustic Modeling for the Recognition of Czech. Proc. 11th International Conference on Text, Speech and Dialogue. Lecture Notes in Computer Science. Berlin: Springer Verlag, 2008. p. 357-363. ISBN: 978-3-540-87390-7.

Type

conference paper

Language

English

Authors

Kopecký Jiří, Ing., FIT (FIT), DCGM (FIT)
Glembek Ondřej, Ing., Ph.D., FIT (FIT), DCGM (FIT)
Karafiát Martin, Ing., Ph.D., FIT (FIT), DCGM (FIT)

Abstract

The paper describes advances in acoustic modeling for the recognition of Czech.

Keywords

Automatic Speech Recognition, LVCSR system, acoustic modeling, HLDA, VTLN, CMLLR, lecture recognition

URL

https://www.fit.vut.cz/research/group/speech/public/publi/2008/kopecky_tsd2008…

Annotation

This paper presents recent advances in automatic speech recognition for the Czech language. Improvements were achieved in both acoustic and language modeling; we focus mainly on the acoustic part. The results are presented in two contexts: lecture recognition and the SpeeCon+Temic test set. The paper shows the impact of advanced modeling techniques such as HLDA, VTLN and CMLLR. On the lecture test set, we show that training acoustic models using word networks together with the pronunciation dictionary gives about 4-5% absolute performance improvement over training on direct phonetic transcriptions. Incorporating the "schwa" phoneme in the training phase yields a slight improvement.

Published

2008

Pages

357–363

Proceedings

Proc. 11th International Conference on Text, Speech and Dialogue

Series

Lecture Notes in Computer Science

Volume

5246

Conference

11th International Conference on Text, Speech and Dialogue

ISBN

978-3-540-87390-7

Publisher

Springer Verlag

Place

Berlin

BibTeX

@inproceedings{BUT29244,
  author="Jiří {Kopecký} and Ondřej {Glembek} and Martin {Karafiát}",
  title="Advances in Acoustic Modeling for the Recognition of Czech",
  booktitle="Proc. 11th International Conference on Text, Speech and Dialogue",
  year="2008",
  series="Lecture Notes in Computer Science",
  volume="5246",
  pages="357--363",
  publisher="Springer Verlag",
  address="Berlin",
  isbn="978-3-540-87390-7",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2008/kopecky_tsd2008.pdf"
}

Projects

Overcoming the language barrier complicating investigation into financing terrorism and serious financial crimes, MV, Program bezpečnostního výzkumu, VD20072010B16, start: 2007-08-01, end: 2010-12-31, completed
Research and development of corpus and speech technologies in new generation of electronic dictionaries, MPO, TANDEM, FT-TA3/006, start: 2006-06-01, end: 2009-12-31, completed

Research groups

Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)

Departments

Department of Computer Graphics and Multimedia (DCGM)