Publication Details

Customization of Automatic Speech Recognition Engines for Rare Word Detection Without Costly Model Re-Training

BHATTACHARJEE, M.; MOTLÍČEK, P.; NIGMATULINA, I.; HELMKE, H.; OHNEISER, O.; KLEINERT, M.; EHR, H. Customization of Automatic Speech Recognition Engines for Rare Word Detection Without Costly Model Re-Training. 13th SESAR Innovation Days 2023, SIDS 2023. Seville: SESAR Joint Undertaking, 2023. p. 1-8. ISSN: 0770-1268.
Czech title
Přizpůsobení systémů pro automatické rozpoznávání řeči pro detekci vzácných slov bez nákladného přetrénování modelu
Type
conference paper
Language
English
Authors
BHATTACHARJEE, M.
Motlíček Petr, doc. Ing., Ph.D. (DCGM)
NIGMATULINA, I.
HELMKE, H.
OHNEISER, O.
KLEINERT, M.
EHR, H.
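URL
https://www.sesarju.eu/sites/default/files/documents/sid/2023/Papers/SIDs_2023_paper_18%20final.pdf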
Keywords

Speech Recognition; Model Adaptation; Integration of prior knowledge;
Customization of model; Rare-word integration

Abstract

Thanks to Alexa, Siri or Google Assistant, automatic speech recognition (ASR) has
changed our daily lives over the last decade. Prototype applications in the air
traffic management (ATM) domain are available. Recently, ASR-supported
pre-filling of radar label entries has reached the technology readiness level
that precedes industrialization (TRL6). However, seldom-spoken, airspace-related
words relevant in the ATM context remain a challenge for sophisticated applications.
Open-source ASR toolkits and large pre-trained models allow experts to tailor
ASR to new domains, but they typically require a certain amount of
domain-specific training data, i.e., transcribed speech for adapting acoustic
and/or language models. In general, it is sufficient for a "universal" ASR
engine to reliably recognize the few hundred words that form the vocabulary of
voice communications between air traffic controllers and pilots. However, for
each airport, some hundred airport-dependent words that are seldom spoken need
to be integrated. These challenging word entities comprise special airline
designators and waypoint names like "dexon" or "burok", which only appear in a
specific region. When used, they are highly informative and thus require high
recognition accuracy. Plug-and-play customization with minimal expert
intervention assumes that no additional training, i.e., fine-tuning of the
universal ASR, is required. This paper presents an innovative approach to
automatically integrate new specific word entities into the universal ASR
system. The recognition rate of these region-specific word entities increases
by a factor of 6 compared with the universal ASR.

Published
2023
Pages
1–8
Proceedings
13th SESAR Innovation Days 2023, SIDS 2023
Volume
2023
Number
11
Conference
13th SESAR Innovation Days, Seville, ES
Publisher
SESAR Joint Undertaking
Place
Seville
DOI
10.61009/SID.2023.1.10
BibTeX
@inproceedings{BUT187995,
  author="BHATTACHARJEE, M. and MOTLÍČEK, P. and NIGMATULINA, I. and HELMKE, H. and OHNEISER, O. and KLEINERT, M. and EHR, H.",
  title="Customization of Automatic Speech Recognition Engines for Rare Word Detection Without Costly Model Re-Training",
  booktitle="13th SESAR Innovation Days 2023, SIDS 2023",
  year="2023",
  volume="2023",
  number="11",
  pages="1--8",
  publisher="SESAR Joint Undertaking",
  address="Seville",
  doi="10.61009/SID.2023.1.10",
  issn="0770-1268",
  url="https://www.sesarju.eu/sites/default/files/documents/sid/2023/Papers/SIDs_2023_paper_18%20final.pdf"
}