Product Details
SW3 ASR pro akusticky náročná prostředí
Created: 2023
English title
SW3 ASR for demanding acoustic conditions
Type
software
License
not public
Authors
Šmídl Luboš, Ing., Ph.D. (WBU in Pilsen)
Karafiát Martin, Ing., Ph.D. (DCGM FIT BUT)
Švec Jan, Ing., Ph.D. (WBU in Pilsen)
Lehečka Jan, Ing., Ph.D. (WBU in Pilsen)
Mošner Ladislav, Ing. (DCGM FIT BUT)
Brukner Jan, Ing. (DCGM FIT BUT)
Keywords
ASR; speech recognition; docker
Description
An automatic speech recognition (ASR) system for an Asian language, built with modern training approaches. A wav2vec model was trained on general recordings and then retrained on Vietnamese recordings, with data augmentation added to cover demanding acoustic conditions, which gave the desired robustness. The result also includes a model for removing noise from recordings (deNoiser). The application is delivered as a Docker container and can be run from the command line on a standard Linux distribution or on Windows.
Projects
Research groups
Speech Data Mining Research Group BUT Speech@FIT (VZ SPEECH)
Departments
Department of Computer Graphics and Multimedia FIT BUT (DCGM FIT BUT)
University of West Bohemia in Pilsen (WBU in Pilsen)