Course details
Speech Signal Processing
ZRE, academic year 2024/2025, summer semester, 5 credits
Applications of speech processing, digital processing of speech signals, production and perception of speech, introduction to phonetics, pre-processing and basic parameters of speech, linear-predictive model, cepstrum, fundamental frequency estimation, coding - time domain and vocoders, recognition - DTW and HMM, synthesis. Software and libraries for speech processing.
Course coordinator
Language of instruction
Time span
- 26 hrs lectures
- 2 hrs exercises
- 12 hrs pc labs
- 12 hrs projects
Assessment points
- 51 pts final exam
- 14 pts mid-term test
- 12 pts labs
- 23 pts projects
Learning objectives
To provide students with knowledge of the basic characteristics of the speech signal in relation to human speech production and hearing. To describe the basic speech analysis algorithms common to many applications. To give an overview of applications (recognition, synthesis, coding) and of the practical aspects of implementing speech algorithms.
Students will become familiar with the basic characteristics of the speech signal in relation to human speech production and hearing. They will understand the basic speech analysis algorithms common to many applications. They will gain an overview of applications (recognition, synthesis, coding) and of the practical aspects of implementing speech algorithms. Students will be able to design a simple speech processing system (a speech activity detector, a recognizer for a limited number of isolated words) and integrate it into application programs.
Study literature
- Gold, B., Morgan, N.: Speech and Audio Signal Processing, Wiley-Interscience, 2nd edition.
- Yu, D., Deng, L.: Automatic Speech Recognition, Springer, 2016.
- Rabiner, L. R., Schafer, R. W.: Theory and Applications of Digital Speech Processing, Pearson, 2011.
- Psutka, J., Müller, L., Matoušek, J., Radová, V.: Mluvíme s počítačem česky, Academia, 2006.
Fundamental literature
- Psutka, J.: Komunikace s počítačem mluvenou řečí. Academia, Praha, 1995, ISBN 80-200-0203-0
- Gold, B., Morgan, N.: Speech and Audio Signal Processing, John Wiley & Sons, 2000, ISBN 0-471-35154-7
- Course web page
Syllabus of lectures
- Introduction, applications of speech processing.
- Digital processing of speech signals.
- Speech production and its signal processing model.
- Pre-processing and basic parameters of speech, cepstrum.
- Linear-predictive model.
- Fundamental frequency estimation.
- Speech coding - basics.
- CELP Speech coding.
- Speech recognition - basics, DTW.
- Hidden Markov models (HMM).
- Large vocabulary continuous speech recognition (LVCSR) systems.
- Speaker and language recognition. Neural networks in speech processing.
- Text to speech synthesis.
Syllabus of numerical exercises
- Parameterization, DTW, HMM.
Syllabus of computer exercises
- Introduction.
- Linear prediction and vector quantization.
- Fundamental frequency estimation and speech coding.
- Basics of classification.
- Recognition - dynamic time warping (DTW).
- Recognition - hidden Markov models (HTK).
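As an illustration of the DTW lab topic, the following is a minimal sketch of isolated-word recognition by dynamic time warping. It is not taken from the course materials: the function names `dtw_distance` and `recognize`, the Euclidean local distance, and the three-direction step pattern are assumptions chosen for brevity, and the feature vectors are toy values rather than real speech parameters.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of feature vectors.

    Assumption: Euclidean local distance and the basic step pattern
    (vertical, horizontal, diagonal) without slope weights.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b frame repeated
                cost[i][j - 1],      # b advances, a frame repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def recognize(features, templates):
    """Return the label of the template with the smallest DTW cost.

    `templates` is a hypothetical dict mapping word labels to
    reference feature sequences.
    """
    return min(templates, key=lambda label: dtw_distance(features, templates[label]))


if __name__ == "__main__":
    templates = {
        "yes": [[0.0], [1.0], [2.0]],
        "no": [[5.0], [5.0], [5.0]],
    }
    # The first frame is repeated, so the utterance is longer than the template.
    utterance = [[0.0], [0.0], [1.0], [2.0]]
    print(recognize(utterance, templates))
```

In a real system the frames would be vectors of cepstral or LPC-based parameters, and the accumulated cost is often normalized by path length before templates of different durations are compared.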
Progress assessment
- mid-term test 14 pts
- project 29 pts
- presentation of results in computer labs 6 pts
Day | Type | Weeks | Room | Start | End | Capacity | Lect.grp | Groups | Info |
Mon | lecture | 2., 3., 4., 5., 6., 7., 8., 9., 10., 12., 13. of lectures | A112 | 16:00 | 17:50 | 64 | 1MIT 2MIT | NSPE xx | Černocký |
Mon | lecture | 2., 3., 4., 5., 6., 7., 8., 9., 10., 12., 13. of lectures | A112 | 18:00 | 19:50 | 40 | 1MIT | NSPE | Černocký |
Mon | comp.lab | lectures | O204 | 18:00 | 19:50 | 20 | 1MIT 2MIT | NSPE NVER xx | Vendrame |
Wed | exam | 2025-05-21 | D0207 | 14:00 | 15:50 | 1st exam date | | | |
Thu | exam | 2025-06-05 | E104 | 15:00 | 16:50 | 3rd exam date | | | |
Fri | exam | 2025-05-30 | E104 | 14:00 | 15:50 | 2nd exam date | | | |
Course inclusion in study plans