An Empirical evaluation of zero resource acoustic unit discovery

Czech title

Empirické hodnocení automatického hledání řečových jednotek bez popsaných trénovacích dat

Type

conference paper

Language

English

Authors

Liu Chunxi
Yang Jinyi
Sun Ming
Kesiraju Santosh, Ph.D. (DCGM)
Rott Alena
Ondel Lucas Antoine Francois, Mgr., Ph.D. (SSDIT)
Ghahremani Pegah
Dehak Najim
Burget Lukáš, doc. Ing., Ph.D. (DCGM)
Khudanpur Sanjeev

URL

http://www.fit.vutbr.cz/research/groups/speech/publi/2017/liu_kesiraju_icassp2017_0005305.pdf PDF

Keywords

Acoustic unit discovery, unsupervised linear discriminant analysis, evaluation methods, zero resource

Abstract

Acoustic unit discovery (AUD) is the process of automatically identifying a categorical acoustic unit inventory from speech and producing the corresponding acoustic unit tokenizations. AUD provides an important avenue for unsupervised acoustic model training in a zero-resource setting, where expert-provided linguistic knowledge and transcribed speech are unavailable. To further facilitate the zero-resource AUD process, in this paper we demonstrate that acoustic feature representations can be significantly improved by (i) performing linear discriminant analysis (LDA) in an unsupervised, self-trained fashion, and (ii) leveraging resources from other languages by building a multilingual bottleneck (BN) feature extractor that gives effective cross-lingual generalization. Moreover, we perform comprehensive evaluations of AUD efficacy on multiple downstream speech applications. Their correlated performance suggests that AUD can be evaluated using alternative language resources when only a subset of these evaluation resources is available, as is typical in zero-resource applications.

Published

2017

Pages

5305–5309

Proceedings

Proceedings of ICASSP 2017

Conference

2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, USA

ISBN

978-1-5090-4117-6

Publisher

IEEE Signal Processing Society

Place

New Orleans

DOI

10.1109/ICASSP.2017.7953169

UT WoS

000414286205093

EID Scopus

2-s2.0-85023756396

BibTeX

@inproceedings{BUT144451,
  author="Chunxi {Liu} and Jinyi {Yang} and Ming {Sun} and Santosh {Kesiraju} and Alena {Rott} and Lucas Antoine Francois {Ondel} and Pegah {Ghahremani} and Najim {Dehak} and Lukáš {Burget} and Sanjeev {Khudanpur}",
  title="An Empirical evaluation of zero resource acoustic unit discovery",
  booktitle="Proceedings of ICASSP 2017",
  year="2017",
  pages="5305--5309",
  publisher="IEEE Signal Processing Society",
  address="New Orleans",
  doi="10.1109/ICASSP.2017.7953169",
  isbn="978-1-5090-4117-6",
  url="https://www.fit.vut.cz/research/publication/11471/"
}

Files

pdf liu_kesiraju_icassp2017_0005305.pdf 255 kB