Project Details
Multilingvální rozpoznávání a vyhledávání v řeči pro elektronické slovníky
Project Period: 1. 9. 2009 - 31. 8. 2013
Project Type: grant
Code: FR-TI1/034
Agency: Ministry of Industry and Trade of the Czech Republic
Program: TIP
English title
Multilingual recognition and search in speech for electronic dictionaries
Keywords
multilinguality, speech recognition, keyword spotting, electronic dictionaries
Abstract
The proposed project aims at research, development and assessment of technologies for prototyping of speech recognition and search systems with only a few hours of transcribed training data, without the need for phonetic or linguistic expertise. These technologies will be tested in the domain of electronic dictionaries.
Team members
Černocký Jan, prof. Dr. Ing. (UPGM FIT VUT), research leader
Burget Lukáš, doc. Ing., Ph.D. (UPGM FIT VUT), team leader
Grézl František, Ing., Ph.D. (UPGM FIT VUT), team leader
Karafiát Martin, Ing., Ph.D. (UPGM FIT VUT), team leader
Matějka Pavel, Ing., Ph.D. (UPGM FIT VUT), team leader
Schwarz Petr, Ing., Ph.D. (UPGM FIT VUT), team leader
Žižka Josef, Ing. (UPGM FIT VUT), team leader
Kubalík Jakub, Ing. (FIT VUT)
Tomášek Pavel, Ing. (FIT VUT)
Veselý Karel, Ing. (FIT VUT)
Publications
2013
- JANDA Miloš. Automatic Generation Of Pronunciation Dictionaries Based On Diarization. In: Proceedings of the 19th Conference Student EEICT 2013. Brno: Brno University of Technology, 2013, pp. 228-232. ISBN 978-80-214-4695-3.
- EGOROVA Ekaterina, VESELÝ Karel, KARAFIÁT Martin, JANDA Miloš and ČERNOCKÝ Jan. Manual and Semi-Automatic Approaches to Building a Multilingual Phoneme Set. In: Proceedings of ICASSP 2013. Vancouver: IEEE Signal Processing Society, 2013, pp. 7324-7328. ISBN 978-1-4799-0355-9.
- SOUFIFAR Mehdi Mohammad, BURGET Lukáš, PLCHOT Oldřich, CUMANI Sandro and ČERNOCKÝ Jan. Regularized Subspace n-Gram Model for Phonotactic iVector Extraction. In: Proceedings of Interspeech 2013. Lyon: International Speech Communication Association, 2013, pp. 74-78. ISBN 978-1-62993-443-3. ISSN 2308-457X.
2012
- SZŐKE Igor, FAPŠO Michal and VESELÝ Karel. BUT2012 Approaches for Spoken Web Search - MediaEval 2012. In: Working Notes Proceedings of the MediaEval 2012 Workshop. Pisa: CEUR-WS.org, 2012, pp. 1-2. ISSN 1613-0073.
- TEJEDOR Javier, FAPŠO Michal, SZŐKE Igor, ČERNOCKÝ Jan and GRÉZL František. Comparison of methods for language-dependent and language-independent query-by-example spoken term detection. ACM Transactions on Information Systems (TOIS), vol. 30, 2012, pp. 1-34. ISSN 1046-8188.
- JANDA Miloš, KARAFIÁT Martin and ČERNOCKÝ Jan. Dealing with Numbers in Grapheme-Based Speech Recognition. In: Proceedings of 15th International Conference on Text, Speech and Dialogue. Lecture Notes in Computer Science, vol. 7499. Berlin, Heidelberg: Springer-Verlag, 2012, pp. 438-445. ISBN 978-3-642-32789-6. ISSN 0302-9743.
- BRUMMER Johan Nikolaas Langenhoven, CUMANI Sandro, GLEMBEK Ondřej, KARAFIÁT Martin, MATĚJKA Pavel, PEŠÁN Jan, PLCHOT Oldřich, SOUFIFAR Mehdi Mohammad, DE VILLIERS Edward and ČERNOCKÝ Jan. Description and analysis of the Brno276 system for LRE2011. In: Proceedings of Odyssey 2012: The Speaker and Language Recognition Workshop. Singapore: International Speech Communication Association, 2012, pp. 216-223. ISBN 978-981-07-3093-2.
- JANDA Miloš. Grapheme Based Speech Recognition. In: Proceedings of the 18th Conference STUDENT EEICT 2012. Brno: Brno University of Technology, 2012, pp. 441-445. ISBN 978-80-214-4460-7.
- KOMBRINK Stefan, MIKOLOV Tomáš, KARAFIÁT Martin and BURGET Lukáš. Improving Language Models for ASR Using Translated In-domain Data. In: Proceedings of 2012 IEEE International Conference on Acoustics, Speech and Signal Processing. Kyoto: IEEE Signal Processing Society, 2012, pp. 4405-4408. ISBN 978-1-4673-0044-5.
- KARAFIÁT Martin, JANDA Miloš, ČERNOCKÝ Jan and BURGET Lukáš. Region Dependent Linear Transforms in Multilingual Speech Recognition. In: Proc. International Conference on Acoustics, Speech, and Signal Processing 2012. Kyoto: IEEE Signal Processing Society, 2012, pp. 4885-4888. ISBN 978-1-4673-0044-5.
- PLCHOT Oldřich, KARAFIÁT Martin, BRUMMER Johan Nikolaas Langenhoven, GLEMBEK Ondřej, MATĚJKA Pavel, DE VILLIERS Edward and ČERNOCKÝ Jan. Speaker vectors from Subspace Gaussian Mixture Model as complementary features for Language Identification. In: Proceedings of Odyssey 2012, The Speaker and Language Recognition Workshop. Singapore: International Speech Communication Association, 2012, pp. 330-333. ISBN 978-981-07-3093-2.
- VESELÝ Karel, KARAFIÁT Martin, GRÉZL František, JANDA Miloš and EGOROVA Ekaterina. The Language-Independent Bottleneck Features. In: Proceedings of IEEE 2012 Workshop on Spoken Language Technology. Miami: IEEE Signal Processing Society, 2012, pp. 336-341. ISBN 978-1-4673-5124-9.
2011
- POVEY Daniel, KARAFIÁT Martin, GHOSHAL Arnab and SCHWARZ Petr. A Symmetrization of the Subspace Gaussian Mixture Model. In: Proceedings of 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing. Prague: IEEE Signal Processing Society, 2011, pp. 4504-4507. ISBN 978-1-4577-0537-3.
- VESELÝ Karel, KARAFIÁT Martin and GRÉZL František. Convolutive Bottleneck Network Features for LVCSR. In: Proceedings of ASRU 2011. Big Island, Hawaii: IEEE Signal Processing Society, 2011, pp. 42-47. ISBN 978-1-4673-0366-8.
- KARAFIÁT Martin, BURGET Lukáš, MATĚJKA Pavel, GLEMBEK Ondřej and ČERNOCKÝ Jan. iVector-Based Discriminative Adaptation for Automatic Speech Recognition. In: Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011, pp. 152-157. ISBN 978-1-4673-0366-8.
- MIKOLOV Tomáš, KOMBRINK Stefan, DEORAS Anoop, BURGET Lukáš and ČERNOCKÝ Jan. RNNLM - Recurrent Neural Network Language Modeling Toolkit. In: Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011, pp. 1-4. ISBN 978-1-4673-0366-8.
- MIKOLOV Tomáš, DEORAS Anoop, POVEY Daniel, BURGET Lukáš and ČERNOCKÝ Jan. Strategies for Training Large Scale Neural Network Language Models. In: Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011, pp. 196-201. ISBN 978-1-4673-0366-8.
- GRÉZL František, KARAFIÁT Martin and JANDA Miloš. Study of Probabilistic and Bottle-Neck Features in Multilingual Environment. In: Proceedings of ASRU 2011. Hilton Waikoloa Village, Big Island, Hawaii: IEEE Signal Processing Society, 2011, pp. 359-364. ISBN 978-1-4673-0366-8.
- POVEY Daniel, GHOSHAL Arnab, BOULIANNE Gilles, BURGET Lukáš, GLEMBEK Ondřej, GOEL Nagendra K., HANNEMANN Mirko, MOTLÍČEK Petr, QIAN Yanmin, SCHWARZ Petr, SILOVSKÝ Jan, STEMMER Georg and VESELÝ Karel. The Kaldi Speech Recognition Toolkit. In: Proceedings of ASRU 2011. Hilton Waikoloa Village Resort, Hawaii: IEEE Signal Processing Society, 2011, pp. 1-4. ISBN 978-1-4673-0366-8.
- POVEY Daniel, BURGET Lukáš, AGARWAL Mohit, AKYAZI Pinar, GHOSHAL Arnab, GLEMBEK Ondřej, GOEL Nagendra K., KARAFIÁT Martin, RASTROW Ariya, ROSE Richard, SCHWARZ Petr, THOMAS Samuel et al. The subspace Gaussian mixture model - A structured model for speech recognition. Computer Speech and Language, vol. 25, no. 2, 2011, pp. 404-439. ISSN 0885-2308.
2010
- GHOSHAL Arnab, POVEY Daniel, AGARWAL Mohit, AKYAZI Pinar, BURGET Lukáš, FENG Kai, GLEMBEK Ondřej, GOEL Nagendra K., KARAFIÁT Martin, RASTROW Ariya, ROSE Richard, SCHWARZ Petr and THOMAS Samuel. A novel estimation of feature-space MLLR for full-covariance models. In: Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010, pp. 4310-4313. ISBN 978-1-4244-4296-6. ISSN 1520-6149.
- GOEL Nagendra K., THOMAS Samuel, AGARWAL Mohit, AKYAZI Pinar, BURGET Lukáš, FENG Kai, GHOSHAL Arnab, GLEMBEK Ondřej, KARAFIÁT Martin, POVEY Daniel, RASTROW Ariya, ROSE Richard and SCHWARZ Petr. Approaches to automatic lexicon learning with limited training examples. In: Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010, pp. 5094-5097. ISBN 978-1-4244-4296-6. ISSN 1520-6149.
- BURGET Lukáš, SCHWARZ Petr, AGARWAL Mohit, AKYAZI Pinar, FENG Kai, GHOSHAL Arnab, GLEMBEK Ondřej, GOEL Nagendra K., KARAFIÁT Martin, POVEY Daniel, RASTROW Ariya, ROSE Richard and THOMAS Samuel. Multilingual acoustic modeling for speech recognition based on Subspace Gaussian Mixture Models. In: Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010, pp. 4334-4337. ISBN 978-1-4244-4296-6. ISSN 1520-6149.
- POVEY Daniel, BURGET Lukáš, AGARWAL Mohit, AKYAZI Pinar, FENG Kai, GHOSHAL Arnab, GLEMBEK Ondřej, GOEL Nagendra K., KARAFIÁT Martin, RASTROW Ariya, ROSE Richard, SCHWARZ Petr and THOMAS Samuel. Subspace Gaussian mixture models for speech recognition. In: Proc. International Conference on Acoustics, Speech, and Signal Processing. Dallas: IEEE Signal Processing Society, 2010, pp. 4330-4333. ISBN 978-1-4244-4296-6. ISSN 1520-6149.
Products
2013
- Multilingual models for speech recognition, software, 2013
Authors: Karafiát Martin, Grézl František, Egorova Ekaterina, Janda Miloš, Černocký Jan
- Prototyping of speech recognizers for new languages, technology, 2013
Authors: Karafiát Martin, Grézl František, Egorova Ekaterina, Janda Miloš, Černocký Jan, Kašpar Michal