Project Details
NTT - Speech enhancement front-end for robust automatic speech recognition with large amount of training data
Project Period: 1. 10. 2017 - 30. 9. 2018
Project Type: contract
Partner: NTT Corporation
Czech title
NTT - Parametrizace s obohacováním řeči pro robustní automatické rozpoznávání řeči s velkým objemem trénovacích dat
Keywords
speech recognition, robustness, large data, DNN embeddings
Abstract
The purpose of the joint research is to develop a speech enhancement front-end for robust automatic speech recognition with a large amount of training data, through cooperation between NTT and BUT. The work relies on embeddings produced by neural networks at various points in the processing chain.
Team members
Žmolíková Kateřina, Ing., Ph.D. (UPGM FIT VUT), research leader
Černocký Jan, prof. Dr. Ing. (UPGM FIT VUT), team leader
Publications
2018
- ROHDIN Johan A., SILNOVA Anna, DIEZ Sánchez Mireia, PLCHOT Oldřich, MATĚJKA Pavel and BURGET Lukáš. End-to-End DNN Based Speaker Recognition Inspired by i-Vector and PLDA. In: Proceedings of ICASSP. Calgary: IEEE Signal Processing Society, 2018, pp. 4874-4878. ISBN 978-1-5386-4658-8.
- DELCROIX Marc, ŽMOLÍKOVÁ Kateřina, KINOSHITA Keisuke, OGAWA Atsunori and NAKATANI Tomohiro. Single Channel Target Speaker Extraction and Recognition with Speaker Beam. In: Proceedings of ICASSP 2018. Calgary: IEEE Signal Processing Society, 2018, pp. 5554-5558. ISBN 978-1-5386-4658-8.
- DELCROIX Marc, ŽMOLÍKOVÁ Kateřina, KINOSHITA Keisuke, ARAKI Shoko, OGAWA Atsunori and NAKATANI Tomohiro. SpeakerBeam: A New Deep Learning Technology for Extracting Speech of a Target Speaker Based on the Speaker's Voice Characteristics. NTT Technical Review, vol. 16, no. 11, 2018, pp. 19-24. ISSN 1348-3447.
2017
- ŽMOLÍKOVÁ Kateřina, DELCROIX Marc, KINOSHITA Keisuke, HIGUCHI Takuya, OGAWA Atsunori and NAKATANI Tomohiro. Learning Speaker Representation for Neural Network Based Multichannel Speaker Extraction. In: Proceedings of ASRU 2017. Okinawa: IEEE Signal Processing Society, 2017, pp. 8-15. ISBN 978-1-5090-4788-8.
- ŽMOLÍKOVÁ Kateřina. Summary report of project "Speech enhancement front-end for robust automatic speech recognition with large amount of training data" for Year 2017. Brno: NTT Corporation, 2017.