Publication Details
Probability-Aware Word-Confusion-Network-to-Text Alignment Approach for Intent Classification
Madikeri Srikanth (IDIAP)
Sharma Bidisha (Uniphore)
Khalil Driss (IDIAP)
Kumar Sashi (IDIAP)
Nigmatulina Iuliia (IDIAP)
Motlíček Petr, doc. Ing., Ph.D. (DCGM FIT BUT)
Ganapathiraju Aravind (Uniphore)
Word-Confusion-Networks, Cross-modal Alignment, Knowledge Distillation, Intent Classification
Spoken Language Understanding (SLU) technologies have greatly improved due to the effective pretraining of speech representations. A common requirement of industry-based solutions is the portability to deploy SLU models in voice-assistant devices. Thus, distilling knowledge from large text-based language models has become an attractive solution for achieving good performance and guaranteeing portability. In this paper, we introduce a novel architecture that uses a cross-modal attention mechanism to extract bin-level contextual embeddings from a word-confusion network (WCN) encoding such that these can be directly compared and aligned with traditional text-based contextual embeddings. This alignment is achieved using a recently proposed tokenwise contrastive loss function. We validate our architecture's effectiveness by fine-tuning our WCN-based pretrained model to perform intent classification (IC) on the well-known SLURP dataset. The accuracy obtained on the IC task (81%) represents a 9.4% relative improvement over a recent, equivalent E2E method.
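The alignment idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the loss formulation, so the code below assumes a generic symmetric InfoNCE-style objective over positions, assumes that WCN bin embeddings and text token embeddings are already paired one-to-one, and uses a hypothetical function name (`tokenwise_contrastive_loss`) and a hypothetical `temperature` parameter.

```python
import numpy as np


def tokenwise_contrastive_loss(wcn_emb, text_emb, temperature=0.1):
    """Symmetric InfoNCE-style loss over paired positions (illustrative assumption).

    wcn_emb:  (T, d) array, one embedding per WCN bin.
    text_emb: (T, d) array, one embedding per text token.
    Row i of wcn_emb is assumed to correspond to row i of text_emb.
    """
    # L2-normalise so the dot product is a cosine similarity.
    a = wcn_emb / np.linalg.norm(wcn_emb, axis=1, keepdims=True)
    b = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (T, T) similarity matrix

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax per row; the target is the diagonal.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the WCN-to-text and text-to-WCN directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))


# Toy data (random, not from the paper): bin embeddings that nearly match the text.
rng = np.random.default_rng(0)
text = rng.normal(size=(6, 16))
aligned = text + 0.01 * rng.normal(size=text.shape)
shuffled = aligned[::-1]  # same vectors, wrong positions

loss_aligned = tokenwise_contrastive_loss(aligned, text)
loss_shuffled = tokenwise_contrastive_loss(shuffled, text)
```

In this toy setup the loss is lower when each bin embedding sits close to the text embedding at the same position than when the same vectors are assigned to the wrong positions, which is the behaviour a tokenwise alignment objective is meant to reward.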
@INPROCEEDINGS{FITPUB13376,
   author = "Esa\'{u} Villatoro-tello and Srikanth Madikeri and Bidisha Sharma and Driss Khalil and Sashi Kumar and Iuliia Nigmatulina and Petr Motl\'{i}\v{c}ek and Aravind Ganapathiraju",
   title = "Probability-Aware Word-Confusion-Network-to-Text Alignment Approach for Intent Classification",
   pages = "12617--12621",
   booktitle = "ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)",
   year = 2024,
   location = "Seoul, KR",
   publisher = "IEEE Signal Processing Society",
   ISBN = "979-8-3503-4485-1",
   doi = "10.1109/ICASSP48485.2024.10445934",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/13376"
}