Project Details
New Trends in Research and Application of Voice Technologies
Project Period: 1. 1. 2005 - 31. 12. 2007
Project Type: grant
Code: GA102/05/0278
Agency: Czech Science Foundation
voice technology;automatic speech recognition;multi-lingual systems;speaker recognition and verification;spontaneous speech recognition;acoustic-visual speech processing;automatic transcription;large speech databases;dialogue systems;prosody optimization
The proposed project follows up on previous research in speech processing carried out by a team that integrates all Czech research groups currently active in speech analysis, synthesis and recognition. The team was established in 1996 to participate in an ambitious 6-year project supported by the GACR, and later continued in another speech-oriented project that ended in 2002. Each of the groups involved has its own expertise in a specific domain, which allows the consortium to work on integrated and complex tasks. In previous years the team has created large databases of annotated speech recordings, which are now available for both training and testing purposes in speech recognition as well as for speech synthesis. In addition, a set of powerful tools and platforms for developing the team's own recognition and synthesis systems has been built, together with several working prototypes that serve for evaluation and demonstration purposes. Building on this state and on recent trends in voice technologies, the project will focus on the investigation and implementation of algorithms applicable in distributed, embedded and mobile systems, in recognition engines working with very large vocabularies, in TTS modules for interactive communication and information services, in automatic transcription of broadcast news, and in multimodal audio-visual interfaces. The research will primarily address the specific needs of Czech.
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT), team leader
Grézl František, Ing., Ph.D. (DCGM FIT BUT), team leader
Chalupníček Kamil, Ing. (DCGM FIT BUT), team leader
Karafiát Martin, Ing., Ph.D. (DCGM FIT BUT), team leader
Matějka Pavel, Ing. (UREL FEEC BUT), team leader
Motlíček Petr, doc. Ing., Ph.D. (DCGM FIT BUT), team leader
Schwarz Petr, Ing., Ph.D. (DCGM FIT BUT), team leader
Szőke Igor, Ing., Ph.D. (DCGM FIT BUT), team leader
- BURGET Lukáš, MATĚJKA Pavel, SCHWARZ Petr, GLEMBEK Ondřej and ČERNOCKÝ Jan. Analysis of feature extraction and channel compensation in GMM speaker recognition system. IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 7, 2007, pp. 1979-1986. ISSN 1558-7916.
- KARAFIÁT Martin, BURGET Lukáš, ČERNOCKÝ Jan and HAIN Thomas. Real-Time ASR from Meetings. In: Proc. INTERSPEECH 2007. Antwerpen: International Speech Communication Association, 2007, p. 4. ISSN 1990-9772.
- MATĚJKA Pavel, BURGET Lukáš, GLEMBEK Ondřej, SCHWARZ Petr, HUBEIKA Valiantsina, FAPŠO Michal, MIKOLOV Tomáš and PLCHOT Oldřich. BUT system description for NIST LRE 2007. In: Proc. 2007 NIST Language Recognition Evaluation Workshop. Orlando: National Institute of Standards and Technology, 2007, pp. 1-5.
- SZŐKE Igor, BURGET Lukáš and KARAFIÁT Martin. Combination of Word and Phoneme Approach for Spoken Term Detection. Brno, 2007.
- BRÜMMER Niko, BURGET Lukáš, ČERNOCKÝ Jan, GLEMBEK Ondřej, GRÉZL František, KARAFIÁT Martin, VAN Leeuwen David, MATĚJKA Pavel, SCHWARZ Petr and STRASHEIM Albert. Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation 2006. IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, no. 7, 2007, pp. 2072-2084. ISSN 1558-7916.
- HUBEIKA Valiantsina, SZŐKE Igor, BURGET Lukáš and ČERNOCKÝ Jan. Maximum Likelihood and Maximum Mutual Information Training in Gender and Age Recognition System. In: Proc. 10th International Conference on Text Speech and Dialogue (TSD 2007). Pilsen: Springer Verlag, 2007, pp. 1-6. ISBN 978-3-540-74627-0.
- GRÉZL František, KARAFIÁT Martin and ČERNOCKÝ Jan. Neural network topologies and bottle neck features in speech recognition. Brno, 2007.
- MIKOLOV Tomáš, OPARIN Ilya, GLEMBEK Ondřej, BURGET Lukáš, KARAFIÁT Martin and ČERNOCKÝ Jan. Použití mluvených korpusů ve vývoji systému pro rozpoznávání českých přednášek. Praha: Charles University, 2007.
- GRÉZL František, KARAFIÁT Martin, KONTÁR Stanislav and ČERNOCKÝ Jan. Probabilistic and bottle-neck features for LVCSR of meetings. In: Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007). Honolulu: IEEE Signal Processing Society, 2007, pp. 757-760. ISBN 1-4244-0728-1.
- ČERNOCKÝ Jan, SZŐKE Igor, FAPŠO Michal, KARAFIÁT Martin, BURGET Lukáš, KOPECKÝ Jiří, GRÉZL František, SCHWARZ Petr, GLEMBEK Ondřej, OPARIN Ilya, SMRŽ Pavel and MATĚJKA Pavel. Search in speech for public security and defense. In: Proc. IEEE Workshop on Signal Processing Applications for Public Security and Forensics, 2007 (SAFE '07). Washington D.C.: IEEE Signal Processing Society, 2007, pp. 1-7. ISBN 1-4244-1226-9.
- FAPŠO Michal. Search in speech records. In: Proc. 13th Conference STUDENT EEICT 2007. Brno: Faculty of Electrical Engineering and Communication BUT, 2007, pp. 1-3. ISBN 978-80-214-3410-3.
- ČERNOCKÝ Jan, BURGET Lukáš, SCHWARZ Petr, MATĚJKA Pavel, KARAFIÁT Martin, GLEMBEK Ondřej, KOPECKÝ Jiří, SZŐKE Igor, FAPŠO Michal, GRÉZL František, HUBEIKA Valiantsina and OPARIN Ilya. Search in speech, language identification and speaker recognition in Speech@FIT. In: Proc. 17th International Conference Radioelektronika, 2007. Brno: Department of Radioelectronics FEEC BUT, 2007, pp. 1-6. ISBN 978-80-214-3390-8.
- SZŐKE Igor, FAPŠO Michal, KARAFIÁT Martin, BURGET Lukáš, GRÉZL František, SCHWARZ Petr, GLEMBEK Ondřej, MATĚJKA Pavel, KOPECKÝ Jiří and ČERNOCKÝ Jan. Spoken Term Detection System Based on a Combination of LVCSR and Phonetic Search. Brno, 2007.
- MATĚJKA Pavel, BURGET Lukáš, SCHWARZ Petr, GLEMBEK Ondřej, KARAFIÁT Martin, GRÉZL František, ČERNOCKÝ Jan, VAN Leeuwen David, BRÜMMER Niko and STRASHEIM Albert. STBU system for the NIST 2006 speaker recognition evaluation. In: Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2007). Honolulu: IEEE Signal Processing Society, 2007, pp. 221-224. ISBN 1-4244-0728-1.
- GRÉZL František and ČERNOCKÝ Jan. TRAP-based Techniques for Recognition of Noisy Speech. In: Proc. 10th International Conference on Text Speech and Dialogue (TSD 2007). LNCS. Berlin: Springer Verlag, 2007, pp. 270-277. ISBN 978-3-540-74627-0.
- AL-HAMES Marc, HAIN Thomas, ČERNOCKÝ Jan, SCHREIBER Sascha, POEL Mannes, MÜLLER Ronald, MARCEL Sebastien, VAN Leeuwen David, ODOBEZ Jean-Marc, BA Sileye, BOURLARD Herve, CARDINAUX Fabien, GATICA-PEREZ Daniel, JANIN Adam, MOTLÍČEK Petr, REITER Stephan, RENALS Steve, VAN Rest Jeroen, RIENKS Rutger, RIGOLL Gerhard, SMITH Kevin, THEAN Andrew and ZEMČÍK Pavel. Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers. In: Proc. 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006). Washington D.C., 2006, p. 12.
- ČERNOCKÝ Jan, MATĚJKA Pavel, BURGET Lukáš and SCHWARZ Petr. Automatic Language Identification System. In: Sborník příspěvků z odborného semináře "Nové technologie v radiokomunikacích". Brno: University of Defence in Brno, 2006, pp. 1-6.
- MATĚJKA Pavel, BURGET Lukáš, SCHWARZ Petr and ČERNOCKÝ Jan. Brno University of Technology System for NIST 2005 Language Recognition Evaluation. In: Proceedings of Odyssey 2006: The Speaker and Language Recognition Workshop. San Juan, 2006, pp. 57-64. ISBN 1-4244-0472-X.
- KOPECKÝ Jiří, SZŐKE Igor, FAPŠO Michal, KARAFIÁT Martin, BURGET Lukáš, OPARIN Ilya, SCHWARZ Petr, MATĚJKA Pavel, ČERNOCKÝ Jan and GLEMBEK Ondřej. BUT System for NIST STD 2006 - Arabic. In: Proc. NIST Spoken Term Detection Evaluation workshop (STD 2006). Washington D.C.: National Institute of Standards and Technology, 2006, p. 15.
- SZŐKE Igor, FAPŠO Michal, KARAFIÁT Martin, BURGET Lukáš, GRÉZL František, SCHWARZ Petr, GLEMBEK Ondřej, MATĚJKA Pavel, KONTÁR Stanislav and ČERNOCKÝ Jan. BUT System for NIST STD 2006 - English. In: Proc. NIST Spoken Term Detection Evaluation workshop (STD 2006). Washington D.C.: National Institute of Standards and Technology, 2006, p. 26.
- GLEMBEK Ondřej, KARAFIÁT Martin, BURGET Lukáš and ČERNOCKÝ Jan. Czech Speech Recognizer for Multiple Environments. In: Radioelektronika 2006. Bratislava, 2006, pp. 1-4.
- BURGET Lukáš, MATĚJKA Pavel and ČERNOCKÝ Jan. Discriminative Training Techniques for Acoustic Language Identification. In: Proceedings of ICASSP 2006. Toulouse, 2006, pp. 209-212.
- HUBEIKA Valiantsina. Estimation of Gender and Age from Recorded Speech. In: Proc. ACM Student Research competition 2006. Prague: Czech Technical University, 2006, pp. 25-32. ISBN 80-01-03595-6.
- SCHWARZ Petr, MATĚJKA Pavel and ČERNOCKÝ Jan. Hierarchical structures of neural networks for phoneme recognition. In: Proceedings of ICASSP 2006. Toulouse, 2006, pp. 325-328.
- BURGET Lukáš, ČERNOCKÝ Jan, FAPŠO Michal, KARAFIÁT Martin, MATĚJKA Pavel, SCHWARZ Petr, SMRŽ Pavel and SZŐKE Igor. Indexing and search methods for spoken documents. In: Proceedings of the Ninth International Conference on Text, Speech and Dialogue, TSD 2006. LNCS. Berlin: Springer Verlag, 2006, pp. 351-358. ISSN 0302-9743.
- FAPŠO Michal, SMRŽ Pavel, SCHWARZ Petr, SZŐKE Igor, SCHWARZ Milan, ČERNOCKÝ Jan, KARAFIÁT Martin and BURGET Lukáš. Information Retrieval from Spoken Documents. In: Proceedings of the Seventh International Conference on Intelligent Text Processing and Computational Linguistics (CICLING 2006). Mexico City: Springer Verlag, 2006, pp. 410-416. ISBN 3-540-32205-1.
- MATĚJKA Pavel, BURGET Lukáš, SCHWARZ Petr and ČERNOCKÝ Jan. NIST 2005 Language Recognition Evaluation. In: Proceedings of NIST LRE 2005. Washington DC: National Institute of Standards and Technology, 2006, pp. 1-37.
- KONTÁR Stanislav. Parallel training of neural networks for speech recognition. In: Proc. 12th International Conference on Soft Computing MENDEL'06. Brno: Brno University of Technology, 2006, p. 6. ISBN 80-214-3195-4.
- KARAFIÁT Martin, GRÉZL František, SCHWARZ Petr, BURGET Lukáš and ČERNOCKÝ Jan. Robust heteroscedastic linear discriminant analysis and LCRC posterior features in large vocabulary continuous speech recognition. In: Proc. Fifth Slovenian and First International Language Technologies Conference. Ljubljana, 2006, pp. 1-4.
- KARAFIÁT Martin, GRÉZL František, SCHWARZ Petr, BURGET Lukáš and ČERNOCKÝ Jan. Robust heteroscedastic linear discriminant analysis and LCRC posterior features in meeting data recognition. In: Proc. 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006). Lecture Notes in Computer Science, vol. 4299. Berlin: Springer Verlag, 2006, pp. 275-284. ISBN 3-540-69267-3.
- FAPŠO Michal, SCHWARZ Petr, SZŐKE Igor, SMRŽ Pavel, SCHWARZ Milan, ČERNOCKÝ Jan, KARAFIÁT Martin and BURGET Lukáš. Search Engine for Information Retrieval from Speech Records. In: Proceedings of the Third International Seminar on Computer Treatment of Slavic and East European Languages. Bratislava, 2006, pp. 100-101.
- MATĚJKA Pavel, SCHWARZ Petr, BURGET Lukáš and ČERNOCKÝ Jan. Use of anti-models to further improve state-of-the-art PRLM language recognition system. In: Proceedings of ICASSP 2006. Toulouse, 2006, pp. 197-200.
- SZŐKE Igor, SCHWARZ Petr, BURGET Lukáš, FAPŠO Michal, KARAFIÁT Martin, ČERNOCKÝ Jan and MATĚJKA Pavel. Comparison of Keyword Spotting Approaches for Informal Continuous Speech. In: Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology. Lisbon, 2005, pp. 633-636. ISSN 1018-4074.
- SZŐKE Igor, SCHWARZ Petr, MATĚJKA Pavel, BURGET Lukáš, FAPŠO Michal, KARAFIÁT Martin and ČERNOCKÝ Jan. Comparison of Keyword Spotting Approaches for Informal Continuous Speech. In: 2nd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms. Edinburgh, 2005, p. 12.
- SUMEC Stanislav and KADLEC Jaroslav. Event Editor - The Multi-Modal Annotation Tool. In: Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI). Edinburgh, 2005, p. 1.
- STOLCKE Andreas, ANGUERA Xavier, BOAKYE Kofi, CETIN Özgür, GRÉZL František, JANIN Adam, MANDAL Arindam, PESKIN Barbara, WOOTERS Chuck and ZHENG Jing. Further Progress in Meeting Recognition: The ICSI-SRI Spring 2005 Speech-to-Text Evaluation System. In: Machine Learning for Multimodal Interaction, Second International Workshop, MLMI 2005, Edinburgh, UK, July 11-13, 2005, Revised Selected Papers. Lecture Notes in Computer Science 3869, Springer 2006. Edinburgh, Scotland: University of Edinburgh, 2005, pp. 463-475. ISBN 978-3-540-32549-9.
- ZHU Qifeng, CHEN Barry, GRÉZL František and MORGAN Nelson. Improved MLP Structures for Data-Driven Feature Extraction for ASR. In: Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology. Lisbon, 2005, p. 4. ISSN 1018-4074.
- MOTLÍČEK Petr, BURGET Lukáš and ČERNOCKÝ Jan. Non-parametric Speaker Turn Segmentation of Meeting Data. In: Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology. Lisbon: International Speech Communication Association, 2005, pp. 657-660. ISSN 1018-4074.
- SZŐKE Igor, SCHWARZ Petr, BURGET Lukáš, KARAFIÁT Martin, MATĚJKA Pavel and ČERNOCKÝ Jan. Phoneme Based Acoustics Keyword Spotting in Informal Continuous Speech. Lecture Notes in Computer Science, vol. 2005, no. 3658, p. 8. ISSN 0302-9743.
- MATĚJKA Pavel. Phoneme Recognition Tuning for Language Identification System. In: Proceedings of the 11th conference STUDENT EEICT 2005. Brno: Faculty of Electrical Engineering and Communication BUT, 2005, pp. 658-653. ISBN 80-214-2890-2.
- MATĚJKA Pavel, SCHWARZ Petr, ČERNOCKÝ Jan and CHYTIL Pavel. Phonotactic Language Identification. In: Proceedings of Radioelektronika 2005. Brno: Faculty of Electrical Engineering and Communication BUT, 2005, pp. 140-143. ISBN 80-214-2904-6.
- MATĚJKA Pavel, SCHWARZ Petr, ČERNOCKÝ Jan and CHYTIL Pavel. Phonotactic Language Identification using High Quality Phoneme Recognition. In: Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology. Lisbon: International Speech Communication Association, 2005, pp. 2237-2240. ISSN 1018-4074.
- FAPŠO Michal, SCHWARZ Petr, SZŐKE Igor, ČERNOCKÝ Jan, SMRŽ Pavel, BURGET Lukáš and KARAFIÁT Martin. Search Engine for Information Retrieval from Multi-modal Records. Edinburgh, 2005.
- SZŐKE Igor. Smooth Pitch Tracker Based on Harmonic and Noise Model. In: STUDENT EEICT 2005. Brno: Faculty of Information Technology BUT, 2005, pp. 673-677. ISBN 80-214-2890-2.
- GRÉZL František. Spectral plane investigation for probabilistic features for ASR. Edinburgh, 2005.
- FAPŠO Michal, SMRŽ Pavel, SCHWARZ Petr, SZŐKE Igor, BURGET Lukáš, KARAFIÁT Martin and ČERNOCKÝ Jan. Systém pre efektívne vyhľadávanie v rečových databázach. In: Sborník databázové konference DATAKON 2005. Brno: Masaryk University, 2005, pp. 323-333. ISBN 80-210-3813-6.
- HAIN Thomas, BURGET Lukáš, DINES John, GARAU Giulia, KARAFIÁT Martin, LINCOLN Mike, MCCOWAN Iain, MOORE Darren, WAN Vincent, ORDELMAN Roeland and RENALS Steve. The 2005 AMI System for the Transcription of Speech in Meetings. In: Machine Learning for Multimodal Interaction, Second International Workshop, MLMI 2005, Edinburgh, UK, July 11-13, 2005, Revised Selected Papers. Lecture Notes in Computer Science Volume 3869, Springer 2006. Edinburgh: University of Edinburgh, 2005, pp. 450-462. ISBN 978-3-540-32549-9.
- ASHBY Simone, BOURBAN Sebastien, CARLETTA Jean, FLYNN Mike, GUILLEMOT Mael, HAIN Thomas, KADLEC Jaroslav, KARAISKOS Vasilis, KRAAIJ Wessel, KRONENTHAL Melissa, LATHOUD Guillaume, LINCOLN Mike, LISOWSKA Agnes, MCCOWAN Iain, POST Wilfried, REIDSMA Dennis and WELLNER Pierre. The AMI Meeting Corpus. In: Measuring Behavior 2005 Proceedings Book. Wageningen, 2005, p. 4.
- ASHBY Simone, BOURBAN Sebastien, CARLETTA Jean, FLYNN Mike, GUILLEMOT Mael, HAIN Thomas, KADLEC Jaroslav, KARAISKOS Vasilis, KRAAIJ Wessel, KRONENTHAL Melissa, LATHOUD Guillaume, LINCOLN Mike, LISOWSKA Agnes, MCCOWAN Iain, POST Wilfried, REIDSMA Dennis and WELLNER Pierre. The AMI Meeting Corpus: A Pre-Announcement. In: Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI). Edinburgh, 2005, p. 4.
- HAIN Thomas, KARAFIÁT Martin, DINES John, MCCOWAN Iain, LINCOLN Mike, GARAU Giulia, WAN Vincent, ORDELMAN Roeland and RENALS Steve. The Development of the AMI System for the Transcription of Speech in Meetings. In: Machine Learning for Multimodal Interaction, Second International Workshop, MLMI 2005, Edinburgh, UK, July 11-13, 2005, Revised Selected Papers. Lecture Notes in Computer Science Volume 3869, Springer 2006. Edinburgh: University of Edinburgh, 2005, pp. 344-356. ISBN 978-3-540-32549-9.
- HAIN Thomas, KARAFIÁT Martin, GARAU Giulia, MOORE Darren, WAN Vincent, ORDELMAN Roeland and RENALS Steve. Transcription of Conference Room Meetings: an Investigation. In: Interspeech'2005 - Eurospeech - 9th European Conference on Speech Communication and Technology. Lisbon: International Speech Communication Association, 2005, p. 4. ISSN 1018-4074.
- MATĚJKA Pavel, SCHWARZ Petr, ČERNOCKÝ Jan and CHYTIL Pavel. Tuning Phonotactic Language Identification System. Brno: Faculty of Information Technology BUT, 2005.
- KARAFIÁT Martin, BURGET Lukáš and ČERNOCKÝ Jan. Using Smoothed Heteroscedastic Linear Discriminant Analysis in Large Vocabulary Continuous Speech Recognition System. In: 2nd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms. This paper was not included in the Revised Selected Papers and did not appear in LNCS 3869. Edinburgh, Scotland: University of Edinburgh, 2005, p. 8.
- Phoneme recognizer based on long temporal context, software, 2008
Authors: Schwarz Petr, Matějka Pavel, Burget Lukáš, Glembek Ondřej
- AMI Large vocabulary continuous speech recognizer, software, 2005
Authors: Burget Lukáš, Hain Thomas, Karafiát Martin
- Indexation and search engine for multimodal data, software, 2005
Authors: Černocký Jan, Fapšo Michal, Schwarz Petr, Szőke Igor
- STK Toolkit, software, 2005
Authors: Burget Lukáš, Černocký Jan, Glembek Ondřej, Karafiát Martin, Kontár Stanislav, Schwarz Petr
- System for automatic language identification (LID), production, 2005
Authors: Burget Lukáš, Černocký Jan, Matějka Pavel, Schwarz Petr
- System for on-line keyword spotting, software, 2005
Authors: Černocký Jan, Matějka Pavel, Schwarz Petr, Szőke Igor
- Web-based system for semi-automatic checks of speech annotations, software, 2005
Authors: Černocký Jan, Chalupníček Kamil, Kašpárek Tomáš