Speech Processing Systems

Guarantor

Černocký Jan, prof. Dr. Ing. (DCGM)

Language of instruction

Czech

Completion

Examination

Time span

Department

Department of Computer Graphics and Multimedia (DCGM)

Study literature

Psutka, J.: Komunikace s počítačem mluvenou řečí. Academia, Praha, 1995, ISBN 80-200-0203-0.
Gold, B., Morgan, N.: Speech and audio signal processing, John Wiley & Sons, 2000, ISBN 0-471-35154-7.

Fundamental literature

Gussenhoven, C. and Jacobs, H.: Understanding Phonology, Oxford University Press, 1998, ISBN 0-340-69218-9.
Psutka, J.: Komunikace s počítačem mluvenou řečí. Academia, Praha, 1995, ISBN 80-200-0203-0.
Gold, B., Morgan, N.: Speech and audio signal processing, John Wiley & Sons, 2000, ISBN 0-471-35154-7.
Moore, B.C.J.: An introduction to the psychology of hearing, Academic Press, 1989, ISBN 0-12-505627-3.
Jelinek, F.: Statistical Methods for Speech Recognition, MIT Press, 1998, ISBN 0-262-10066-5.
Manning, C. and Schütze, H.: Foundations of Statistical Natural Language Processing, MIT Press, Cambridge, MA, 1999.

Syllabus of lectures

Phonetics and phonology - syllable structure, phonological processes and distinctive features.
Statistical pattern classification I. - Bayesian framework, Maximum likelihood learning, Gaussian mixture models. Features for GMM modeling.
Statistical pattern classification II. - Artificial Neural Networks, Support vector machines. Sequence modeling - Hidden Markov models.
HMM training and adaptation - MLLR, MAP, discriminative training.
HMM recognition - pronunciation dictionaries and networks, language modeling, decoding, lattices.
Phoneme recognition. Keyword spotting and search - LVCSR, acoustic and phonetic lattices. Figure of Merit.
Speaker identification and verification - GMM, SVM. Channel normalization and compensation - feature mapping, eigenvoices and nuisance attribute projection (NAP). Evaluation of speaker verification - DET curves, EER, cost function.
Language identification - acoustic vs. phonotactic, evaluation.
Speech coding - CELP framework - adaptive and stochastic codebooks, GSM standards.
Language modeling 1 - n-gram models, class-based models.
Language modeling 2 - language-specific features, factored language models.
Psycholinguistics - word recognition models, word associations.
Probabilistic parsing - inside-outside algorithm, dependency parsing

Course inclusion in study plans

Programme IT-MGR-2, field MBI, MBS, MIS, MMI, MMM, MPS, MPV, MSK, any year of study, Elective
Programme IT-MGR-2, field MGM, 2nd year of study, Elective
Programme IT-MGR-2, field MIN, any year of study, Compulsory-Elective