Publication Details

The 2005 AMI System for the Transcription of Speech in Meetings

HAIN Thomas, BURGET Lukáš, DINES John, GARAU Giulia, KARAFIÁT Martin, LINCOLN Mike, MCCOWAN Iain, MOORE Darren, WAN Vincent, ORDELMAN Roeland and RENALS Steve. The 2005 AMI System for the Transcription of Speech in Meetings. In: Machine Learning for Multimodal Interaction, Second International Workshop, MLMI 2005, Edinburgh, UK, July 11-13, 2005, Revised Selected Papers. Lecture Notes in Computer Science Volume 3869, Springer 2006. Edinburgh: University of Edinburgh, 2005, pp. 450-462. ISBN 978-3-540-32549-9.
Czech title
2005 AMI systém pro přepis řečových meetingů
Type
conference paper
Language
English
Authors
Hain Thomas (USF)
Burget Lukáš, doc. Ing., Ph.D. (DCGM FIT BUT)
Dines John (IDIAP)
Garau Giulia (UEDIN)
Karafiát Martin, Ing., Ph.D. (DCGM FIT BUT)
Lincoln Mike (IDIAP)
McCowan Iain (IDIAP)
Moore Darren (IDIAP)
Wan Vincent (USF)
Ordelman Roeland (UTWENTE)
Renals Steve (UEDIN)
Keywords

NIST, speech recognition, AMI system

Abstract

In this paper we describe the 2005 AMI system for the transcription of speech in meetings, which was used for participation in the 2005 NIST RT evaluations.

Annotation

In this paper we describe the 2005 AMI system for the transcription of speech in meetings, which was used for participation in the 2005 NIST RT evaluations. The system was designed for the speech-to-text part of the evaluations, in particular for transcription of speech recorded with multiple distant microphones and independent headset microphones. System performance was tested on both conference room and lecture-style meetings. Although input sources are processed using different front-ends, the recognition process is based on a unified system architecture. The system operates in multiple passes and makes use of state-of-the-art technologies such as discriminative training, vocal tract length normalisation, heteroscedastic linear discriminant analysis, speaker adaptation with maximum likelihood linear regression, and minimum word error rate decoding. We report the system performance on the official development and test sets for the NIST RT05s evaluations. The system was jointly developed in less than 10 months by a multi-site team and was shown to achieve very competitive performance.

Published
2005
Pages
450-462
Proceedings
Machine Learning for Multimodal Interaction, Second International Workshop, MLMI 2005, Edinburgh, UK, July 11-13, 2005, Revised Selected Papers
Series
Lecture Notes in Computer Science Volume 3869, Springer 2006
Conference
2nd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms, Edinburgh, GB
ISBN
978-3-540-32549-9
Publisher
University of Edinburgh
Place
Edinburgh, GB
BibTeX
@INPROCEEDINGS{FITPUB7932,
   author = "Thomas Hain and Luk\'{a}\v{s} Burget and John Dines and Giulia Garau and Martin Karafi\'{a}t and Mike Lincoln and Iain McCowan and Darren Moore and Vincent Wan and Roeland Ordelman and Steve Renals",
   title = "The 2005 AMI System for the Transcription of Speech in Meetings",
   pages = "450--462",
   booktitle = "Machine Learning for Multimodal Interaction, Second International Workshop, MLMI 2005, Edinburgh, UK, July 11-13, 2005, Revised Selected Papers",
   series = "Lecture Notes in Computer Science Volume 3869, Springer 2006",
   year = 2005,
   location = "Edinburgh, GB",
   publisher = "University of Edinburgh",
   ISBN = "978-3-540-32549-9",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/7932"
}