Publication Details
Variational Inference for Acoustic Unit Discovery
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT)
Bayesian non-parametric, Variational Bayes, acoustic unit discovery
In this article, we proposed training a nonparametric Bayesian model for automatic unit discovery within the Variational Bayes framework. Besides simplifying the training scheme, this approach proves to be fast and yields a better solution, which makes it more suitable for large databases. However, despite the observed improvement, the model still has difficulties with the diversity of speech and tends to learn a large part of unwanted variability. The HMM model of a speech segment is convenient but unrealistic, and a stronger model will most likely be needed to achieve accurate automatic unit discovery. We plan to extend the present work by using VB inference with more complex models, as in [13], and to leverage Bayesian language models [14] to further improve the accuracy of the discovered units.
Recently, several nonparametric Bayesian models have been proposed to automatically discover acoustic units in unlabeled data. Most of them are trained using various versions of the Gibbs Sampling (GS) method. In this work, we consider Variational Bayes (VB) as an alternative inference process. Even though VB yields only an approximation of the posterior distribution, it can be easily parallelized, which makes it more suitable for large databases. Results show that VB inference, besides being an order of magnitude faster, outperforms GS in terms of accuracy.
@INPROCEEDINGS{FITPUB11224,
  author = "Francois Antoine Lucas Yang Ondel and Luk\'{a}\v{s} Burget and Jan \v{C}ernock\'{y}",
  title = "Variational Inference for Acoustic Unit Discovery",
  pages = "80--86",
  booktitle = "Procedia Computer Science",
  journal = "Procedia Computer Science",
  volume = 2016,
  number = 81,
  year = 2016,
  location = "Yogyakarta, ID",
  publisher = "Elsevier Science",
  ISSN = "1877-0509",
  doi = "10.1016/j.procs.2016.04.033",
  language = "english",
  url = "https://www.fit.vut.cz/research/publication/11224"
}