Publication Details

Speech production under stress for machine learning: multimodal dataset of 79 cases and 8 signals

PEŠÁN Jan, JUŘÍK Vojtěch, RŮŽIČKOVÁ Alexandra, SVOBODA Vojtěch, JANOUŠEK Oto, NĚMCOVÁ Andrea, BOJANOVSKÁ Hana, ALDABAGHOVÁ Jasmína, KYSLÍK Filip, VODIČKOVÁ Kateřina, SODOMOVÁ Adéla, BARTYS Patrik, CHUDÝ Peter and ČERNOCKÝ Jan. Speech production under stress for machine learning: multimodal dataset of 79 cases and 8 signals. Nature Scientific Data, vol. 11, no. 1, 2024, pp. 1-9. ISSN 2052-4463. Available from: https://www.nature.com/articles/s41597-024-03991-w
Czech title
Tvorba řeči ve stresu pro strojové učení: multimodální dataset 79 mluvčích a 8 signálů
Type
journal article
Language
English
Authors
Pešán Jan, Ing. (DCGM FIT BUT)
Juřík Vojtěch (ICAECS FCE BUT)
Růžičková Alexandra (FF MUNI)
Svoboda Vojtěch (FF MUNI)
Janoušek Oto, Ing., Ph.D. (FEEC BUT)
Němcová Andrea, Ing. (DCSE FEECS BUT)
Bojanovská Hana (FF MUNI)
Aldabaghová Jasmína (FF MUNI)
Kyslík Filip (FF MUNI)
Vodičková Kateřina (FF MUNI)
Sodomová Adéla (FF MUNI)
Bartys Patrik (FF MUNI)
Chudý Peter, doc. Ing., Ph.D. MBA (DCGM FIT BUT)
Černocký Jan, prof. Dr. Ing. (DCGM FIT BUT)
URL
https://www.nature.com/articles/s41597-024-03991-w
Keywords

speech, stress, machine learning

Abstract

Early identification of cognitive or physical overload is critical in fields where human decision making matters when preventing threats to safety and property. Pilots, drivers, surgeons, and operators of nuclear plants are among those affected by this challenge, as acute stress can impair their cognition. In this context, the significance of paralinguistic automatic speech processing increases for early stress detection. The intensity, intonation, and cadence of an utterance are examples of paralinguistic traits that determine the meaning of a sentence and are often lost in the verbatim transcript. To address this issue, tools are being developed to recognize paralinguistic traits effectively. However, a data bottleneck still exists in the training of paralinguistic speech traits, and the lack of high-quality reference data for the training of artificial systems persists. Regarding this, we present an original empirical dataset collected using the BESST experimental protocol for capturing speech signals under induced stress. With this data, our aim is to promote the development of pre-emptive intervention systems based on stress estimation from speech.

Published
2024
Pages
1-9
Journal
Nature Scientific Data, vol. 11, no. 1, ISSN 2052-4463
Publisher
Nature Portfolio Berlin
DOI
10.1038/s41597-024-03991-w
UT WoS
001353330000007
EID Scopus
BibTeX
@ARTICLE{FITPUB13308,
   author = "Jan Pe\v{s}\'{a}n and Vojt\v{e}ch Ju\v{r}\'{i}k and Alexandra R\r{u}\v{z}i\v{c}kov\'{a} and Vojt\v{e}ch Svoboda and Oto Janou\v{s}ek and Andrea N\v{e}mcov\'{a} and Hana Bojanovsk\'{a} and Jasm\'{i}na Aldabaghov\'{a} and Filip Kysl\'{i}k and Kate\v{r}ina Vodi\v{c}kov\'{a} and Ad\'{e}la Sodomov\'{a} and Patrik Bartys and Peter Chud\'{y} and Jan \v{C}ernock\'{y}",
   title = "Speech production under stress for machine learning: multimodal dataset of 79 cases and 8 signals",
   pages = "1--9",
   journal = "Nature Scientific Data",
   volume = 11,
   number = 1,
   year = 2024,
   ISSN = "2052-4463",
   doi = "10.1038/s41597-024-03991-w",
   language = "english",
   url = "https://www.fit.vut.cz/research/publication/13308"
}