Publication Details

Speech and Language Recognition with Low-rank Adaptation of Pretrained Models

PRASAD Amrutha, MADIKERI Srikanth, KHALIL Driss, MOTLÍČEK Petr and SCHUEPBACH Christof. Speech and Language Recognition with Low-rank Adaptation of Pretrained Models. In: Proceedings of Interspeech. Kos Island: International Speech Communication Association, 2024, pp. 2825-2829. ISSN 1990-9772. Available from: https://www.isca-archive.org/interspeech_2024/prasad24_interspeech.html
Czech title
Rozpoznávání řeči a jazyka s Low-rank adaptací předtrénovaných modelů
Type
conference paper
Language
English
Authors
Prasad Amrutha (DCGM FIT BUT)
Madikeri Srikanth (IDIAP)
Khalil Driss (IDIAP)
Motlíček Petr, doc. Ing., Ph.D. (DCGM FIT BUT)
Schuepbach Christof (armasuisse)
URL
Keywords

parameter reduction, language identification, speech recognition, wav2vec2.0

Abstract

Finetuning large pretrained models demands considerable computational resources, posing practical constraints. The majority of the parameters in these models are used by fully connected layers. In this work, we show that applying a semi-orthogonal constraint, followed by full finetuning, to the fully connected layers reduces model parameters significantly without sacrificing efficacy in downstream tasks. Specifically, we consider wav2vec2.0 XLS-R and Whisper models for Automatic Speech Recognition and Language Recognition. Our results show that we can reduce the model size by approximately 24% during both training and inference time with a 0.7% absolute drop in performance for XLS-R and no drop in performance for Whisper for ASR. In combination with performance-efficient training with low-rank adapters, the resource requirements for training can be further reduced by up to 90%.
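
The abstract does not give implementation details, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a fully connected layer of size d_in x d_out is replaced by two factors of rank r, and that "semi-orthogonal" means a factor M with orthonormal rows (M M^T = I). The layer sizes, the rank, and all function names are hypothetical and are not taken from the paper.

```python
import math
import random


def factorized_param_count(d_in, d_out, rank):
    """Weights in two factors (d_in x rank) and (rank x d_out) that replace one dense layer."""
    return rank * (d_in + d_out)


def semi_orthogonal_rows(rows, cols, seed=0):
    """Build a rows x cols matrix (rows <= cols) with orthonormal rows via Gram-Schmidt."""
    rng = random.Random(seed)
    basis = []
    while len(basis) < rows:
        v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
        # Remove the components along the rows already accepted.
        for b in basis:
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-8:
            basis.append([x / norm for x in v])
    return basis


def gram(m):
    """Return M M^T, which is the identity for a matrix with orthonormal rows."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in m] for r1 in m]


# Hypothetical layer sizes and rank, chosen only for illustration.
d_in, d_out, rank = 1024, 4096, 512
dense = d_in * d_out
low_rank = factorized_param_count(d_in, d_out, rank)
saving = 1 - low_rank / dense
print(f"dense={dense} low_rank={low_rank} saving={saving:.1%}")

# Check the semi-orthogonality property on a small example.
M = semi_orthogonal_rows(4, 16)
G = gram(M)
max_dev = max(abs(G[i][j] - (1.0 if i == j else 0.0))
              for i in range(4) for j in range(4))
print(f"max deviation of M M^T from identity: {max_dev:.2e}")
```

The per-layer saving printed here depends entirely on the assumed rank and layer shape. It is not comparable to the approximately 24% whole-model reduction reported in the abstract, which also counts layers that are not factorized.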

Published
2024
Pages
2825-2829
Journal
Proceedings of Interspeech - on-line, vol. 2024, no. 9, ISSN 1990-9772
Proceedings
Proceedings of Interspeech
Conference
Interspeech Conference, Kos, GR
Publisher
International Speech Communication Association
Place
Kos Island, GR
DOI
10.21437/Interspeech.2024-2187
BibTeX
@INPROCEEDINGS{FITPUB13296,
   author = "Amrutha Prasad and Srikanth Madikeri and Driss Khalil and Petr Motl\'{i}\v{c}ek and Christof Schuepbach",
   title = "Speech and Language Recognition with Low-rank Adaptation of Pretrained Models",
   pages = "2825--2829",
   booktitle = "Proceedings of Interspeech",
   journal = "Proceedings of Interspeech - on-line",
   volume = 2024,
   number = 9,
   year = 2024,
   location = "Kos Island, GR",
   publisher = "International Speech Communication Association",
   ISSN = "1990-9772",
   doi = "10.21437/Interspeech.2024-2187",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/13296"
}