Publication Details

Multi-Channel Speaker Verification with Conv-Tasnet Based Beamformer

MOŠNER Ladislav, PLCHOT Oldřich, BURGET Lukáš and ČERNOCKÝ Jan. Multi-Channel Speaker Verification with Conv-Tasnet Based Beamformer. In: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. Singapore: IEEE Signal Processing Society, 2022, pp. 7982-7986. ISBN 978-1-6654-0540-9. Available from: https://ieeexplore.ieee.org/document/9747771
Czech title
Multikanálové ověřování mluvčího se směrováním akustického paprsku založeným na Conv-Tasnet
Type
conference paper
Language
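English
Authors
MOŠNER Ladislav, PLCHOT Oldřich, BURGET Lukáš, ČERNOCKÝ Jan
URL
https://ieeexplore.ieee.org/document/9747771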
Keywords

Conv-TasNet, beamforming, embedding extractor, speaker verification, MultiSV

Abstract

We focus on the problem of speaker recognition in far-field multichannel data. The main contribution is an alternative way of predicting spatial covariance matrices (SCMs) for a beamformer from the time-domain signal. We propose to use Conv-TasNet, a well-known source separation model, and we adapt it to perform speech enhancement by forcing it to separate speech and additive noise. We experiment with using the STFT of Conv-TasNet outputs to obtain SCMs of speech and noise, and finally, we fine-tune this multi-channel frontend with respect to the speaker verification objective. We successfully tackle the lack of a realistic multichannel training set by using simulated data from the MultiSV corpus. The analysis is performed on its retransmitted and simulated test parts. We achieve consistent improvements with a model 2.7 times smaller than the baseline, which is based on a scheme with a mask-estimating neural network.
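
The SCM-to-beamformer step mentioned in the abstract can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption rather than the authors' implementation: the abstract does not name the beamformer type, so a reference-channel MVDR formulation is used for illustration; the function names (`stft`, `spatial_covariance`, `mvdr_weights`, `beamform`) are hypothetical; and synthetic speech and noise signals stand in for the time-domain outputs that Conv-TasNet would produce.

```python
import numpy as np

def stft(x, n_fft=256, hop=128):
    """Naive STFT of a (channels, samples) array -> (channels, freqs, frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (x.shape[-1] - n_fft) // hop
    frames = np.stack(
        [x[:, i * hop : i * hop + n_fft] * window for i in range(n_frames)],
        axis=-1,
    )  # (channels, n_fft, frames)
    return np.fft.rfft(frames, axis=1)

def spatial_covariance(X):
    """Per-frequency SCM: time average of x x^H, shape (freqs, ch, ch)."""
    return np.einsum("cft,dft->fcd", X, X.conj()) / X.shape[-1]

def mvdr_weights(scm_speech, scm_noise, ref=0, eps=1e-6):
    """Assumed MVDR variant: w = (Phi_n^-1 Phi_s) u_ref / trace(Phi_n^-1 Phi_s)."""
    n_ch = scm_noise.shape[-1]
    scm_noise = scm_noise + eps * np.eye(n_ch)          # diagonal loading
    numerator = np.linalg.solve(scm_noise, scm_speech)  # Phi_n^-1 Phi_s
    trace = np.trace(numerator, axis1=-2, axis2=-1)[:, None]
    return numerator[..., ref] / (trace + eps)          # (freqs, ch)

def beamform(weights, X):
    """Apply w^H x for every frequency and frame -> (freqs, frames)."""
    return np.einsum("fc,cft->ft", weights.conj(), X)

# Synthetic stand-ins for the separated speech and noise estimates.
rng = np.random.default_rng(0)
speech = rng.standard_normal(16000)
gains = np.array([1.0, 0.8, 0.6, 0.9])            # hypothetical channel gains
speech_image = gains[:, None] * speech[None, :]   # (4, samples)
noise = 0.3 * rng.standard_normal((4, 16000))
mixture = speech_image + noise

S, N, Y = stft(speech_image), stft(noise), stft(mixture)
w = mvdr_weights(spatial_covariance(S), spatial_covariance(N))
enhanced = beamform(w, Y)                         # single-channel STFT
```

In this toy setup the beamformer passes the reference-channel speech essentially unchanged while reducing the noise power relative to the reference microphone. In the paper's pipeline the enhanced signal would then feed the embedding extractor, which this sketch does not cover.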

Published
2022
Pages
7982-7986
Proceedings
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Conference
2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, SG
ISBN
978-1-6654-0540-9
Publisher
IEEE Signal Processing Society
Place
Singapore, SG
DOI
10.1109/ICASSP43922.2022.9747771
UT WoS
000864187908058
EID Scopus
BibTeX
@INPROCEEDINGS{FITPUB12786,
   author = "Ladislav Mo\v{s}ner and Old\v{r}ich Plchot and Luk\'{a}\v{s} Burget and Jan \v{C}ernock\'{y}",
   title = "Multi-Channel Speaker Verification with Conv-Tasnet Based Beamformer",
   pages = "7982--7986",
   booktitle = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
   year = 2022,
   location = "Singapore, SG",
   publisher = "IEEE Signal Processing Society",
   ISBN = "978-1-6654-0540-9",
   doi = "10.1109/ICASSP43922.2022.9747771",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/12786"
}