Publication Details

iVector Fusion of Prosodic and Cepstral Features for Speaker Verification

KOCKMANN Marcel, FERRER Luciana, BURGET Lukáš and ČERNOCKÝ Jan. iVector Fusion of Prosodic and Cepstral Features for Speaker Verification. In: Proceedings of Interspeech 2011. Florence: International Speech Communication Association, 2011, pp. 265-268. ISBN 978-1-61839-270-1. ISSN 1990-9772.

Czech title

iVektorová fúze prozodických a cepstrálních příznaků pro ověřování mluvčího

Type

conference paper

Language

English

Authors

Kockmann Marcel, Dipl.-Ing. (DCGM FIT BUT)
Ferrer Luciana (SRI)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT)

URL

http://www.fit.vutbr.cz/research/groups/speech/publi/2011/kockmann_interspeech2011_677.pdf (PDF)

Keywords

speaker verification, prosody, JFA, iVector, SMM, fusion

Abstract

This paper presents first results on using total variability modeling of the mean supervector space for a set of prosodic features. We show that this iVector approach outperforms the standard JFA approach originally proposed for these features. This improvement over JFA is observed only when the iVectors are modeled with a PLDA back end.

Annotation

In this paper we apply the promising iVector extraction technique, followed by PLDA modeling, to simple prosodic contour features. With this procedure we achieve results comparable to those of a system that models much more complex prosodic features using our recently proposed SMM-based iVector modeling technique. We then propose combining both prosodic iVectors by joint PLDA modeling, which leads to significant improvements over the individual systems and reaches an EER of 5.4% on NIST SRE 2008 telephone data. Finally, we combine these two prosodic iVector front ends with a baseline cepstral iVector system to achieve up to a 21% relative reduction in the new DCF.
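For readers unfamiliar with the two figures quoted in the annotation, the following is a minimal sketch of how an equal error rate (EER) and a relative reduction are commonly computed. It is not the scoring procedure used in the paper: the function names `equal_error_rate` and `relative_reduction` are hypothetical, the scores are toy values rather than results from the paper, and the simple threshold sweep only approximates the EER.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption: higher scores mean "more likely the same speaker".
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss: a target trial scored below the threshold.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial scored at or above the threshold.
        p_fa = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(p_miss - p_fa)
        if gap < best_gap:
            # The EER is the operating point where both error rates meet.
            best_gap, eer = gap, (p_miss + p_fa) / 2
    return eer


def relative_reduction(baseline, improved):
    """Relative reduction of an error metric with respect to a baseline."""
    return (baseline - improved) / baseline


# Toy scores, purely illustrative (not taken from the paper).
targets = [2.0, 1.5, 0.2, 1.0]
nontargets = [-1.0, 0.5, -0.5, 0.0]
print(f"EER: {equal_error_rate(targets, nontargets):.2%}")
print(f"Relative reduction: {relative_reduction(0.5, 0.4):.0%}")
```

A relative reduction is measured against the baseline value, so the same absolute gain counts for more when the baseline error is already low.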

Published

2011

Pages

265-268

Journal

Proceedings of Interspeech - on-line, vol. 2011, no. 8, ISSN 1990-9772

Proceedings

Proceedings of Interspeech 2011

Conference

Interspeech Conference, Florence, IT

ISBN

978-1-61839-270-1

Publisher

International Speech Communication Association

Place

Florence, IT

BibTeX

@INPROCEEDINGS{FITPUB9753,
   author = "Marcel Kockmann and Luciana Ferrer and Luk\'{a}\v{s} Burget and Jan \v{C}ernock\'{y}",
   title = "iVector Fusion of Prosodic and Cepstral Features for Speaker Verification",
   pages = "265--268",
   booktitle = "Proceedings of Interspeech 2011",
   journal = "Proceedings of Interspeech - on-line",
   volume = 2011,
   number = 8,
   year = 2011,
   location = "Florence, IT",
   publisher = "International Speech Communication Association",
   ISBN = "978-1-61839-270-1",
   ISSN = "1990-9772",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/9753"
}