Publication Details

Diacorrect: Error Correction Back-End for Speaker Diarization

HAN Jiangyu, LANDINI Federico Nicolás, ROHDIN Johan A., DIEZ Sánchez Mireia, BURGET Lukáš, CAO Yuhang, LU Heng and ČERNOCKÝ Jan. Diacorrect: Error Correction Back-End for Speaker Diarization. In: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Seoul: IEEE Signal Processing Society, 2024, pp. 11181-11185. ISBN 979-8-3503-4485-1. Available from: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10446968
Czech title
Diacorrect: Back-End pro opravu chyb diarizace řečníka
Type
conference paper
Language
English
Authors
Han Jiangyu, M.Eng. (DCGM FIT BUT)
Landini Federico Nicolás (DCGM FIT BUT)
Rohdin Johan A., Dr. (DCGM FIT BUT)
Diez Sánchez Mireia, M.Sc., Ph.D. (DCGM FIT BUT)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Cao Yuhang (Ximalaya)
Lu Heng (Ximalaya)
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT)
Keywords

Speaker diarization, error correction, conversational telephone speech

Abstract

In this work, we propose an error correction framework, named DiaCorrect, to refine the output of a diarization system in a simple yet effective way. This method is inspired by error correction techniques in automatic speech recognition. Our model consists of two parallel convolutional encoders and a transformer-based decoder. By exploiting the interactions between the input recording and the initial system's outputs, DiaCorrect can automatically correct the initial speaker activities to minimize the diarization errors. Experiments on 2-speaker telephony data show that the proposed DiaCorrect can effectively improve the initial model's results. Our source code is publicly available at https://github.com/BUTSpeechFIT/diacorrect.
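
To make the notion of "diarization errors" on 2-speaker data concrete, the following is a minimal illustrative sketch of a frame-level error measure computed under the best speaker permutation. It is not taken from the DiaCorrect repository and is not the scoring procedure used in the paper: the function name `frame_error_rate`, the list-of-frames input format, and the toy activity matrices are all assumptions made for illustration, and details of standard scoring tools (such as forgiveness collars) are omitted.

```python
from itertools import permutations


def frame_error_rate(reference, hypothesis):
    """Frame-level diarization error under the best speaker permutation.

    Hypothetical helper, for illustration only. Both arguments are lists of
    frames; each frame is a list of 0/1 activity flags, one per speaker.
    """
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis need the same number of frames")
    total_ref = sum(sum(frame) for frame in reference)
    if total_ref == 0:
        raise ValueError("reference contains no speech")

    num_speakers = len(reference[0])
    best_errors = None
    # Speaker labels are arbitrary, so try every mapping of hypothesis
    # speakers onto reference speakers and keep the one with fewest errors.
    for perm in permutations(range(num_speakers)):
        errors = 0
        for ref, hyp in zip(reference, hypothesis):
            mapped = [hyp[p] for p in perm]
            n_ref, n_hyp = sum(ref), sum(mapped)
            n_correct = sum(r * h for r, h in zip(ref, mapped))
            # missed speech + false alarm + speaker confusion for this frame
            errors += max(n_ref, n_hyp) - n_correct
        if best_errors is None or errors < best_errors:
            best_errors = errors
    return best_errors / total_ref


# Toy 2-speaker example (made-up data): five frames, columns are speakers.
reference = [[1, 0], [1, 0], [1, 1], [0, 1], [0, 0]]
# Initial output: labels swapped, overlap missed, false alarm in the last frame.
initial = [[0, 1], [0, 1], [0, 1], [1, 0], [1, 0]]
# A corrected output that fixes the overlap and the false alarm.
corrected = [[0, 1], [0, 1], [1, 1], [1, 0], [0, 0]]

print(frame_error_rate(reference, initial))
print(frame_error_rate(reference, corrected))
```

In this toy case the swapped speaker labels cost nothing because of the permutation search; only the missed overlap and the false alarm count as errors in the initial output, and both disappear after correction.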

Published
2024
Pages
11181-11185
Proceedings
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Conference
2024 IEEE International Conference on Acoustics, Speech and Signal Processing IEEE, Seoul, KR
ISBN
979-8-3503-4485-1
Publisher
IEEE Signal Processing Society
Place
Seoul, KR
DOI
10.1109/ICASSP48485.2024.10446968
BibTeX
@INPROCEEDINGS{FITPUB13268,
   author = "Jiangyu Han and Federico Nicol\'{a}s Landini and Johan A. Rohdin and Mireia S\'{a}nchez Diez and Luk\'{a}\v{s} Burget and Yuhang Cao and Heng Lu and Jan \v{C}ernock\'{y}",
   title = "Diacorrect: Error Correction Back-End for Speaker Diarization",
   pages = "11181--11185",
   booktitle = "ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)",
   year = 2024,
   location = "Seoul, KR",
   publisher = "IEEE Signal Processing Society",
   ISBN = "979-8-3503-4485-1",
   doi = "10.1109/ICASSP48485.2024.10446968",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/13268"
}