Publication Details

Challenging margin-based speaker embedding extractors by using the variational information bottleneck

STAFYLAKIS Themos, SILNOVA Anna, ROHDIN Johan A., PLCHOT Oldřich and BURGET Lukáš. Challenging margin-based speaker embedding extractors by using the variational information bottleneck. In: Proceedings of Interspeech 2024. Kos: International Speech Communication Association, 2024, pp. 3220-3224. ISSN 1990-9772. Available from: https://www.isca-archive.org/interspeech_2024/stafylakis24_interspeech.pdf
Czech title
Extraktory embeddingů řečníků pro náročné okrajové podmínky s variačním informačním bottleneckem
Type
conference paper
Language
English
Authors
Stafylakis Themos (OMILIA)
Silnova Anna, MSc., Ph.D. (DCGM FIT BUT)
Rohdin Johan A., Dr. (DCGM FIT BUT)
Plchot Oldřich, Ing., Ph.D. (DCGM FIT BUT)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
URL
Keywords

speaker recognition, variational information bottleneck

Abstract

Speaker embedding extractors are typically trained using a classification loss over the training speakers. During the last few years, the standard softmax/cross-entropy loss has been replaced by the margin-based losses, yielding significant improvements in speaker recognition accuracy. Motivated by the fact that the margin merely reduces the logit of the target speaker during training, we consider a probabilistic framework that has a similar effect. The variational information bottleneck provides a principled mechanism for making deterministic nodes stochastic, resulting in an implicit reduction of the posterior of the target speaker. We experiment with a wide range of speaker recognition benchmarks and scoring methods and report competitive results to those obtained with the state-of-the-art Additive Angular Margin loss.
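The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the margin and scale values, and the standard-normal prior in the KL term are assumptions made for illustration only. The first function shows how an additive angular margin lowers the target-speaker logit; the second shows the reparameterised sampling that turns a deterministic embedding into a stochastic one.

```python
import numpy as np


def aam_logits(embedding, class_weights, target, margin=0.2, scale=30.0):
    """Return (logits with margin, plain cosine logits).

    The margin is added to the angle between the embedding and the
    target-class weight, which reduces that logit when angle + margin <= pi.
    margin and scale are illustrative values, not taken from the paper.
    """
    e = embedding / np.linalg.norm(embedding)
    w = class_weights / np.linalg.norm(class_weights, axis=1, keepdims=True)
    cos = np.clip(w @ e, -1.0, 1.0)
    theta = np.arccos(cos)
    with_margin = cos.copy()
    with_margin[target] = np.cos(theta[target] + margin)
    return scale * with_margin, scale * cos


def vib_sample(mu, log_var, rng):
    """Draw z = mu + sigma * eps and compute KL(N(mu, sigma^2) || N(0, I)).

    The standard-normal prior is an assumption of this sketch.
    """
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * log_var) * eps
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
    return z, kl


if __name__ == "__main__":
    weights = np.eye(3)                      # three hypothetical speakers
    emb = np.array([1.0, 0.2, 0.1])          # embedding close to speaker 0
    margin_logits, plain_logits = aam_logits(emb, weights, target=0)
    print("target logit, plain vs. margin:", plain_logits[0], margin_logits[0])

    rng = np.random.default_rng(0)
    z, kl = vib_sample(np.array([1.0, 0.0]), np.zeros(2), rng)
    print("stochastic embedding:", z, "KL term:", kl)
```

In both cases the classifier sees a less favourable view of the target speaker during training: explicitly through the margin, or implicitly through the noise injected into the embedding.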

Published
2024
Pages
3220-3224
Journal
Proceedings of Interspeech - on-line, vol. 2024, no. 9, ISSN 1990-9772
Proceedings
Proceedings of Interspeech 2024
Conference
Interspeech Conference, Kos, GR
Publisher
International Speech Communication Association
Place
Kos, GR
DOI
10.21437/Interspeech.2024-2058
BibTeX
@INPROCEEDINGS{FITPUB13319,
   author = "Themos Stafylakis and Anna Silnova and Johan A. Rohdin and Old\v{r}ich Plchot and Luk\'{a}\v{s} Burget",
   title = "Challenging margin-based speaker embedding extractors by using the variational information bottleneck",
   pages = "3220--3224",
   booktitle = "Proceedings of Interspeech 2024",
   journal = "Proceedings of Interspeech - on-line",
   volume = 2024,
   number = 9,
   year = 2024,
   location = "Kos, GR",
   publisher = "International Speech Communication Association",
   ISSN = "1990-9772",
   doi = "10.21437/Interspeech.2024-2058",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/13319"
}