Publication Details
Topic identification of spoken documents using unsupervised acoustic unit discovery
Pappagari Raghavendra (IIIT)
Ondel Yang Lucas Antoine Francois, Mgr., Ph.D. (DCGM FIT BUT)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Dehak Najim (JHU)
Khudanpur Sanjeev (JHU)
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT)
Gangashetty Suryakanth V (IIIT)
topic identification, acoustic unit discovery, unsupervised learning, non-parametric Bayesian models
This paper investigates the application of unsupervised acoustic unit discovery for topic identification (topic ID) of spoken audio documents. The acoustic unit discovery method is based on a nonparametric Bayesian phone-loop model that segments a speech utterance into phone-like categories. The discovered phone-like (acoustic) units are then fed into a conventional topic ID framework. Using multilingual bottleneck features for the acoustic unit discovery, we show that the proposed method outperforms other systems that are based on a cross-lingual phoneme recognizer.
@INPROCEEDINGS{FITPUB11470,
   author = "Santosh Kesiraju and Raghavendra Pappagari and Francois Antoine Lucas Yang Ondel and Luk\'{a}\v{s} Burget and Najim Dehak and Sanjeev Khudanpur and Jan \v{C}ernock\'{y} and V Suryakanth Gangashetty",
   title = "Topic identification of spoken documents using unsupervised acoustic unit discovery",
   pages = "5745--5749",
   booktitle = "Proceedings of ICASSP 2017",
   year = 2017,
   location = "New Orleans, US",
   publisher = "IEEE Signal Processing Society",
   ISBN = "978-1-5090-4117-6",
   doi = "10.1109/ICASSP.2017.7953257",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/11470"
}