Publication Details

Spelling-Aware Word-Based End-to-End ASR

EGOROVA Ekaterina, VYDANA Hari K., BURGET Lukáš and ČERNOCKÝ Jan. Spelling-Aware Word-Based End-to-End ASR. IEEE Signal Processing Letters, vol. 29, no. 29, 2022, pp. 1729-1733. ISSN 1558-2361. Available from: https://ieeexplore.ieee.org/document/9833231
Czech title
End-to-End systém pro rozpoznávání řeči založený na slovech beroucí v úvahu jejich hláskování
Type
journal article
Language
English
Authors
Ekaterina Egorova, Hari K. Vydana, Lukáš Burget, Jan Černocký
URL
https://ieeexplore.ieee.org/document/9833231
Keywords

end-to-end, ASR, OOV, Listen Attend and Spell architecture

Abstract

We propose a new end-to-end architecture for automatic speech recognition that expands the listen, attend and spell (LAS) paradigm. While the main network is trained to predict words, a secondary speller network is optimized to predict word spellings from inner representations of the main network (e.g., word embeddings or context vectors from the attention module). We show that this joint training improves the word error rate of a word-based system and enables solving additional tasks, such as out-of-vocabulary word detection and recovery. The tests are conducted on the LibriSpeech dataset, which consists of 1000 h of read speech.
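
The joint training described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the interpolation weight `speller_weight`, and the choice to average the per-character losses are all assumptions made for illustration. It only shows how a word-level loss and a spelling-level loss might be combined into one objective.

```python
import math


def cross_entropy(logits, target):
    """Negative log-probability of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def joint_loss(word_logits, word_target, char_logits_seq, char_targets,
               speller_weight=0.5):
    """Hypothetical joint objective: word loss plus weighted speller loss.

    word_logits:     scores over the word vocabulary for one output step
    char_logits_seq: one list of character scores per letter of the word
    speller_weight:  assumed interpolation weight (not taken from the paper)
    """
    if not char_targets or len(char_logits_seq) != len(char_targets):
        raise ValueError("need one logit vector per target character")
    word_loss = cross_entropy(word_logits, word_target)
    char_losses = [cross_entropy(logits, target)
                   for logits, target in zip(char_logits_seq, char_targets)]
    spell_loss = sum(char_losses) / len(char_losses)  # mean over characters
    return word_loss + speller_weight * spell_loss
```

In a real system both terms would be computed from network outputs and backpropagated together, so that the spelling objective also shapes the representations used by the word-predicting network.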

Published
2022
Pages
1729-1733
Journal
IEEE Signal Processing Letters, vol. 29, no. 29, ISSN 1558-2361
Publisher
IEEE Signal Processing Society
DOI
10.1109/LSP.2022.3192199
UT WoS
000842088200001
EID Scopus
BibTeX
@ARTICLE{FITPUB12803,
   author = "Ekaterina Egorova and Hari K. Vydana and Luk\'{a}\v{s} Burget and Jan \v{C}ernock\'{y}",
   title = "Spelling-Aware Word-Based End-to-End ASR",
   pages = "1729--1733",
   journal = "IEEE Signal Processing Letters",
   volume = 29,
   number = 29,
   year = 2022,
   ISSN = "1558-2361",
   doi = "10.1109/LSP.2022.3192199",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/12803"
}