Publication Details

Linguistic Unit Discovery from Multi-Modal Inputs in Unwritten Languages: Summary of the 'Speaking Rosetta' JSALT 2017 Workshop

SCHARENBORG Odette, BESACIER Laurent, BLACK Alan, HASEGAWA-JOHNSON Mark, METZE Florian, NEUBIG Graham, STÜKER Sebastian, GODARD Pierre, MÜLLER Markus, ONDEL Yang Lucas Antoine Francois, PALASKAR Shruti, ARTHUR Philip, CIANNELLA Francesco, DU Mingxing, LARSEN Elin, MERKX Danny, RIAD Rachid, WANG Liming and DUPOUX Emmanuel. Linguistic Unit Discovery from Multi-Modal Inputs in Unwritten Languages: Summary of the 'Speaking Rosetta' JSALT 2017 Workshop. In: Proceedings of ICASSP 2018. Calgary: IEEE Signal Processing Society, 2018, pp. 4979-4983. ISBN 978-1-5386-4658-8.
Czech title
Objevování lingvistických jednotek z multi-modálních vstupů v nepsaných jazycích - souhrn JSALT 2017 workshopu "řečová Rosettská deska"
Type
conference paper
Language
English
Authors
Scharenborg Odette (RUN)
Besacier Laurent (UGA)
Black Alan (CMU)
Hasegawa-Johnson Mark (UILLINOIS)
Metze Florian (CMU)
Neubig Graham (CMU)
Stüker Sebastian (KIT)
Godard Pierre (LIMSI)
Müller Markus (KIT)
Ondel Yang Lucas Antoine Francois, Mgr., Ph.D. (DCGM FIT BUT)
Palaskar Shruti (CMU)
Arthur Philip (CMU)
Ciannella Francesco (CMU)
Du Mingxing (INRIA)
Larsen Elin (INRIA)
Merkx Danny (RUN)
Riad Rachid (INRIA)
Wang Liming (UILLINOIS)
Dupoux Emmanuel (ENS)
URL
Keywords

unwritten languages, multi-modal data, unsupervised unit discovery, image retrieval, machine translation.

Abstract

We summarize the accomplishments of a multi-disciplinary workshop exploring the computational and scientific issues surrounding the discovery of linguistic units (subwords and words) in a language without orthography. We study the replacement of orthographic transcriptions by images and/or translated text in a well-resourced language to help unsupervised discovery from raw speech.

Published
2018
Pages
4979-4983
Proceedings
Proceedings of ICASSP 2018
Conference
IEEE International Conference on Acoustics, Speech and Signal Processing, Calgary, CA
ISBN
978-1-5386-4658-8
Publisher
IEEE Signal Processing Society
Place
Calgary, CA
DOI
10.1109/ICASSP.2018.8461761
UT WoS
000446384605030
EID Scopus
BibTeX
@INPROCEEDINGS{FITPUB11718,
   author = "Odette Scharenborg and Laurent Besacier and Alan Black and Mark Hasegawa-Johnson and Florian Metze and Graham Neubig and Sebastian St{\"{u}}ker and Pierre Godard and Markus M{\"{u}}ller and Francois Antoine Lucas Yang Ondel and Shruti Palaskar and Philip Arthur and Francesco Ciannella and Mingxing Du and Elin Larsen and Danny Merkx and Rachid Riad and Liming Wang and Emmanuel Dupoux",
   title = "Linguistic Unit Discovery from Multi-Modal Inputs in Unwritten Languages: Summary of the 'Speaking Rosetta' JSALT 2017 Workshop",
   pages = "4979--4983",
   booktitle = "Proceedings of ICASSP 2018",
   year = 2018,
   location = "Calgary, CA",
   publisher = "IEEE Signal Processing Society",
   ISBN = "978-1-5386-4658-8",
   doi = "10.1109/ICASSP.2018.8461761",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/11718"
}