Publication Details

Pretraining End-to-End Keyword Search with Automatically Discovered Acoustic Units

YUSUF Bolaji, ČERNOCKÝ Jan and SARAÇLAR Murat. Pretraining End-to-End Keyword Search with Automatically Discovered Acoustic Units. In: Proceedings of Interspeech 2024. Kos: International Speech Communication Association, 2024, pp. 5068-5072. ISSN 1990-9772. Available from: https://www.isca-archive.org/interspeech_2024/yusuf24b_interspeech.pdf
Czech title
Předtrénování celostního vyhledávání klíčových slov s automaticky určenými akustickými jednotkami
Type
conference paper
Language
English
Authors
Yusuf Bolaji (DCGM FIT BUT)
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT)
Saraçlar Murat (UBOGAZ)
URL
Keywords

keyword search, spoken term detection, acoustic unit discovery

Abstract

End-to-end (E2E) keyword search (KWS) has emerged as an alternative and complementary approach to conventional keyword search which depends on the output of automatic speech recognition (ASR) systems. While E2E methods greatly simplify the KWS pipeline, they generally have worse performance than their ASR-based counterparts, which can benefit from pretraining with untranscribed data. In this work, we propose a method for pretraining E2E KWS systems with untranscribed data, which involves using acoustic unit discovery (AUD) to obtain discrete units for untranscribed data and then learning to locate sequences of such units in the speech. We conduct experiments across languages and AUD systems: we show that finetuning such a model significantly outperforms a model trained from scratch, and the performance improvements are generally correlated with the quality of the AUD system used for pretraining.

Published
2024
Pages
5068-5072
Journal
Proceedings of Interspeech - on-line, vol. 2024, no. 9, ISSN 1990-9772
Proceedings
Proceedings of Interspeech 2024
Conference
Interspeech Conference, Kos, GR
Publisher
International Speech Communication Association
Place
Kos, GR
DOI
10.21437/Interspeech.2024-1713
BibTeX
@INPROCEEDINGS{FITPUB13320,
   author = "Bolaji Yusuf and Jan \v{C}ernock\'{y} and Murat Sara\c{c}lar",
   title = "Pretraining End-to-End Keyword Search with Automatically Discovered Acoustic Units",
   pages = "5068--5072",
   booktitle = "Proceedings of Interspeech 2024",
   journal = "Proceedings of Interspeech - on-line",
   volume = 2024,
   number = 9,
   year = 2024,
   location = "Kos, GR",
   publisher = "International Speech Communication Association",
   ISSN = "1990-9772",
   doi = "10.21437/Interspeech.2024-1713",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/13320"
}