Course details
Speech Processing Systems
Guarantor
Černocký Jan, prof. Dr. Ing. (DCGM)
Language of instruction
Czech
Completion
Examination
Time span
- 39 hrs lectures
- 13 hrs projects
Department
Study literature
- Psutka, J.: Komunikace s počítačem mluvenou řečí. Academia, Praha, 1995, ISBN 80-200-0203-0.
- Gold, B., Morgan, N.: Speech and audio signal processing, John Wiley & Sons, 2000, ISBN 0-471-35154-7.
Fundamental literature
- Gussenhoven, C. and Jacobs, H.: Understanding Phonology, Oxford University Press, 1998, ISBN 0-340-69218-9.
- Psutka, J.: Komunikace s počítačem mluvenou řečí. Academia, Praha, 1995, ISBN 80-200-0203-0.
- Gold, B., Morgan, N.: Speech and audio signal processing, John Wiley & Sons, 2000, ISBN 0-471-35154-7.
- Moore, B.C.J.: An introduction to the psychology of hearing, Academic Press, 1989, ISBN 0-12-505627-3.
- Jelinek, F.: Statistical Methods for Speech Recognition, MIT Press, 1998, ISBN 0-262-10066-5.
- Manning, C. and Schütze, H.: Foundations of Statistical Natural Language Processing, MIT Press, Cambridge, MA, 1999.
Syllabus of lectures
- Phonetics and phonology - syllable structure, phonological processes and distinctive features.
- Statistical pattern classification I. - Bayesian framework, maximum likelihood learning, Gaussian mixture models (GMM). Features for GMM modeling.
- Statistical pattern classification II. - Artificial neural networks, support vector machines. Sequence modeling - hidden Markov models (HMM).
- HMM training and adaptation - MLLR, MAP, discriminative training.
- HMM recognition - pronunciation dictionaries and networks, language modeling, decoding, lattices.
- Phoneme recognition. Keyword spotting and search - LVCSR, acoustic and phonetic lattices. Figure of Merit.
- Speaker identification and verification - GMM, SVM. Channel normalization and compensation - feature mapping, eigenvoices and nuisance attribute projection (NAP). Evaluation of speaker verification: DET curves, EER, cost function.
- Language identification - acoustic vs. phonotactic, evaluation.
- Speech coding - CELP framework - adaptive and stochastic codebooks, GSM standards.
- Language modeling 1 - n-gram models, class-based models.
- Language modeling 2 - language-specific features, factored language models.
- Psycholinguistics - word recognition models, word associations.
- Probabilistic parsing - inside-outside algorithm, dependency parsing.
Course inclusion in study plans