Publication Details
Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers
Al-Hames Marc (TUM)
Hain Thomas (USF)
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT)
Schreiber Sascha (TUM)
Poel Mannes (UTWENTE)
Müller Ronald (TUM)
Marcel Sebastien (IDIAP)
van Leeuwen David (TNO)
Odobez Jean-Marc (IDIAP)
Ba Sileye (IDIAP)
Bourlard Herve (IDIAP)
Cardinaux Fabien (IDIAP)
Gatica-Perez Daniel (IDIAP)
Janin Adam (ICSI Berkeley)
Motlíček Petr, doc. Ing., Ph.D. (DCGM FIT BUT)
Reiter Stephan (TUM)
Renals Steve (UEDIN)
van Rest Jeroen (TNO)
Rienks Rutger (UTWENTE)
Rigoll Gerhard, Prof. Dr.-Ing. (TUM)
Smith Kevin (IDIAP)
Thean Andrew (TNO)
Zemčík Pavel, prof. Dr. Ing. (DCGM FIT BUT)
speech processing, video processing, multi-modal interaction
The paper addresses audio-visual processing in meetings: it poses seven questions and presents the current AMI answers.
The Augmented Multi-party Interaction (AMI) project is concerned with the development of meeting browsers and remote meeting assistants for instrumented meeting rooms, and with the R&D themes of the required component technologies: group dynamics, audio, visual, and multimodal processing, content abstraction, and human-computer interaction. The audio-visual processing workpackage within AMI addresses automatic recognition from audio, video, and combined audio-video streams that have been recorded during meetings. In this article we describe the progress made in the first two years of the project. We show how the large problem of audio-visual processing in meetings can be split into seven questions, such as "Who is acting during the meeting?". We then show which algorithms and methods have been developed and evaluated for automatically answering these questions.
@INPROCEEDINGS{FITPUB8237,
  author = "Marc Al-Hames and Thomas Hain and Jan \v{C}ernock\'{y} and Sascha Schreiber and Mannes Poel and Ronald M{\"{u}}ller and Sebastien Marcel and David van Leeuwen and Jean-Marc Odobez and Sileye Ba and Herve Bourlard and Fabien Cardinaux and Daniel Gatica-Perez and Adam Janin and Petr Motl\'{i}\v{c}ek and Stephan Reiter and Steve Renals and Jeroen van Rest and Rutger Rienks and Gerhard Rigoll and Kevin Smith and Andrew Thean and Pavel Zem\v{c}\'{i}k",
  title = "Audio-Visual Processing in Meetings: Seven Questions and Current AMI Answers",
  pages = 12,
  booktitle = "Proc. 3rd Joint Workshop on Multimodal Interaction and Related Machine Learning Algorithms (MLMI 2006)",
  year = 2006,
  location = "Washington D.C., US",
  language = "english",
  url = "https://www.fit.vut.cz/research/publication/8237"
}