Result Details

iVector Approach to Phonotactic Language Recognition

SOUFIFAR, M.; KOCKMANN, M.; BURGET, L.; PLCHOT, O.; GLEMBEK, O.; SVENDSEN, T. iVector Approach to Phonotactic Language Recognition. In Proceedings of Interspeech 2011. Proceedings of Interspeech. Florence: International Speech Communication Association, 2011. no. 8, p. 2913-2916. ISBN: 978-1-61839-270-1. ISSN: 1990-9772.

Type

conference paper

Language

English

Authors

Soufifar Mehdi Mohammad, Ing., DCGM (FIT)
Kockmann Marcel, Dipl.-Ing., Ph.D., FIT (FIT)
Burget Lukáš, doc. Ing., Ph.D., DCGM (FIT)
Plchot Oldřich, Ing., Ph.D., FIT (FIT), DCGM (FIT)
Glembek Ondřej, Ing., Ph.D., FIT (FIT), DCGM (FIT)
Svendsen Torbjorn

Abstract

We proposed a novel method to extract iVectors by means of subspace multinomial modelling of the n-gram counts. Using the proposed subspace model, the huge vector of n-gram counts is represented by a low-dimensional iVector while preserving the discriminative power of the vector.

Keywords

language recognition, subspace modeling, multinomial distribution

URL

https://www.fit.vut.cz/research/group/speech/public/publi/2011/soufifar…

Annotation

This paper addresses a novel technique for the representation and processing of n-gram counts in phonotactic language recognition (LRE): subspace multinomial modelling represents the vectors of n-gram counts by low-dimensional vectors of coordinates in the total variability subspace, called iVectors. Two techniques for iVector scoring are tested: support vector machines (SVM) and logistic regression (LR). Using the standard NIST LRE 2009 task as our evaluation set, the latter scoring approach was shown to outperform a phonotactic LRE system based on direct SVM classification of n-gram count vectors. The proposed iVector paradigm also shows results comparable to the previously proposed PCA-based phonotactic feature extraction.
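The idea of representing a count vector by a few subspace coordinates can be illustrated with a minimal sketch. It assumes the common subspace multinomial parameterisation, in which the n-gram probabilities of an utterance are a softmax of m + T w (m an offset vector, T the subspace matrix, w the iVector). The names m, T, w, the function names, and the plain gradient-ascent estimator are illustrative assumptions, not the estimation procedure used in the paper.

```python
import numpy as np

def ngram_log_probs(m, T, w):
    """Log multinomial probabilities, log softmax(m + T @ w) (assumed model form)."""
    logits = m + T @ w
    logits = logits - logits.max()  # shift for numerical stability
    return logits - np.log(np.exp(logits).sum())

def log_likelihood(counts, m, T, w):
    """Multinomial log-likelihood of the n-gram counts, up to a constant."""
    return float(counts @ ngram_log_probs(m, T, w))

def extract_ivector(counts, m, T, steps=200, lr=0.5):
    """Estimate w for one utterance with m and T held fixed.

    Uses simple gradient ascent for illustration only; this is an
    assumption, not the update scheme described in the paper.
    """
    w = np.zeros(T.shape[1])
    total = counts.sum()
    for _ in range(steps):
        phi = np.exp(ngram_log_probs(m, T, w))
        # Gradient of the log-likelihood with respect to w
        grad = T.T @ (counts - total * phi)
        w += lr * grad / total
    return w
```

In such a setup, the low-dimensional vectors w extracted per utterance would then be passed to a classifier such as an SVM or logistic regression in place of the full n-gram count vectors.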

Published

2011

Pages

2913–2916

Journal

Proceedings of Interspeech, vol. 2011, no. 8, ISSN 1990-9772

Proceedings

Proceedings of Interspeech 2011

Conference

Interspeech Conference

ISBN

978-1-61839-270-1

Publisher

International Speech Communication Association

Place

Florence

EID Scopus

2-s2.0-84865703431

BibTeX

@inproceedings{BUT76439,
  author="Mehdi Mohammad {Soufifar} and Marcel {Kockmann} and Lukáš {Burget} and Oldřich {Plchot} and Ondřej {Glembek} and Torbjorn {Svendsen}",
  title="iVector Approach to Phonotactic Language Recognition",
  booktitle="Proceedings of Interspeech 2011",
  year="2011",
  journal="Proceedings of Interspeech",
  volume="2011",
  number="8",
  pages="2913--2916",
  publisher="International Speech Communication Association",
  address="Florence",
  isbn="978-1-61839-270-1",
  issn="1990-9772",
  url="http://www.fit.vutbr.cz/research/groups/speech/publi/2011/soufifar_interspeech2011_703.pdf"
}

Projects

DARPA Robust Automatic Transcription of Speech (RATS) - RATS Patrol I, BBN, start: 2010-09-23, end: 2014-06-30, completed
Security-Oriented Research in Information Technology, MŠMT, Institutional funds of the Czech state budget (e.g. VZ, VC), MSM0021630528, start: 2007-01-01, end: 2013-12-31, running

Research groups

Speech Data Mining Research Group BUT Speech@FIT (RG SPEECH)

Departments

Department of Computer Graphics and Multimedia (DCGM)