Publication Details

Gaussian meta-embeddings for efficient scoring of a heavy-tailed PLDA model

BRUMMER Johan Nikolaas Langenhoven, SILNOVA Anna, BURGET Lukáš and STAFYLAKIS Themos. Gaussian meta-embeddings for efficient scoring of a heavy-tailed PLDA model. In: Proceedings of Odyssey 2018. Les Sables d'Olonne: International Speech Communication Association, 2018, pp. 349-356. ISSN 2312-2846.

Czech title

Gaussovské meta-embeddingy pro efektivní skórování PLDA modelu s těžkým chvostem

Type

conference paper

Language

English

Authors

Brummer Johan Nikolaas Langenhoven, Dr. (Phonexia)
Silnova Anna, MSc., Ph.D. (DCGM FIT BUT)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Stafylakis Themos (OMILIA)

URL

http://www.fit.vutbr.cz/research/groups/speech/publi/2018/brummer_odyssey2018_51.pdf

Keywords

embeddings, machine learning, speaker recognition

Abstract

Embeddings in machine learning are low-dimensional representations of complex input patterns, with the property that simple geometric operations like Euclidean distances and dot products can be used for classification and comparison tasks. We introduce meta-embeddings, which live in more general inner product spaces and which are designed to better propagate uncertainty through the embedding bottleneck. Traditional embeddings are trained to maximize between-class and minimize within-class distances. Meta-embeddings are trained to maximize relevant information throughput. As a proof of concept in speaker recognition, we derive an extractor from the familiar generative Gaussian PLDA model (GPLDA). We show that GPLDA likelihood ratio scores are given by Hilbert space inner products between Gaussian likelihood functions, which we term Gaussian meta-embeddings (GMEs). Meta-embedding extractors can be generatively or discriminatively trained. GMEs extracted by GPLDA have fixed precisions and do not propagate uncertainty. We show that a generalization to heavy-tailed PLDA gives GMEs with variable precisions, which do propagate uncertainty. Experiments on NIST SRE 2010 and 2016 show that the proposed method applied to i-vectors without length normalization is up to 20% more accurate than GPLDA applied to length-normalized i-vectors.
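The scoring idea in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: a standard normal prior on the hidden speaker variable z, diagonal precisions, a score normalized by the two single-recording expectations, and hypothetical function names. Each Gaussian meta-embedding is stored as natural parameters (a, b), representing f(z) = exp(a·z - 0.5 Σ b_i z_i²). The inner product of two embeddings is taken as the prior expectation of their pointwise product, which has a closed form.

```python
import math

def log_expectation(a, b):
    """log E[f(z)] for f(z) = exp(a.z - 0.5 * sum_i b_i * z_i**2), z ~ N(0, I).

    Closed form per dimension: 0.5 * a_i^2 / (1 + b_i) - 0.5 * log(1 + b_i).
    Assumes diagonal precisions with every b_i > -1.
    """
    return sum(0.5 * ai * ai / (1.0 + bi) - 0.5 * math.log1p(bi)
               for ai, bi in zip(a, b))

def pool(e1, e2):
    """Pointwise product of two embeddings: the natural parameters add."""
    (a1, b1), (a2, b2) = e1, e2
    return ([x + y for x, y in zip(a1, a2)],
            [x + y for x, y in zip(b1, b2)])

def log_likelihood_ratio(e1, e2):
    """log of E[f1 * f2] / (E[f1] * E[f2]): a normalized inner product."""
    a, b = pool(e1, e2)
    return log_expectation(a, b) - log_expectation(*e1) - log_expectation(*e2)

def gplda_embedding(x, w):
    """Embedding for x = z + noise with per-dimension noise precision w.

    The precision b = w is the same for every recording, which matches the
    abstract's remark that GPLDA embeddings have fixed precisions.
    """
    return ([wi * xi for wi, xi in zip(w, x)], list(w))

# Hypothetical two-dimensional example with made-up observations.
w = [1.0, 2.0]
e1 = gplda_embedding([1.0, -0.5], w)
e2 = gplda_embedding([2.0, 0.3], w)
score = log_likelihood_ratio(e1, e2)
```

Under this sketch, a heavy-tailed model would differ only in supplying a different b for each recording, so that less reliable recordings get smaller precisions; `log_likelihood_ratio` itself would stay unchanged.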

Published

2018

Pages

349-356

Journal

Proceedings of Odyssey: The Speaker and Language Recognition Workshop, vol. 2018, no. 6, ISSN 2312-2846

Proceedings

Proceedings of Odyssey 2018

Conference

Odyssey 2018, Les Sables d'Olonne, France

Publisher

International Speech Communication Association

Place

Les Sables d'Olonne, FR

DOI

10.21437/Odyssey.2018-49

EID Scopus

2-s2.0-85054974266

BibTeX

@INPROCEEDINGS{FITPUB11790,
   author = "Johan Nikolaas Langenhoven Brummer and Anna Silnova and Luk\'{a}\v{s} Burget and Themos Stafylakis",
   title = "Gaussian meta-embeddings for efficient scoring of a heavy-tailed PLDA model",
   pages = "349--356",
   booktitle = "Proceedings of Odyssey 2018",
   journal = "Proceedings of Odyssey: The Speaker and Language Recognition Workshop",
   volume = 2018,
   number = 6,
   year = 2018,
   location = "Les Sables d'Olonne, FR",
   publisher = "International Speech Communication Association",
   ISSN = "2312-2846",
   doi = "10.21437/Odyssey.2018-49",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/11790"
}