Publication Details
Deriving Spectro-temporal Properties of Hearing from Speech Data
Ondel Francois Antoine Lucas Yang
Li Ruizhi (JHU)
Sell Gregory (JHU)
Heřmanský Hynek, prof. Ing., Dr.Eng. (JHU)
perception, spectro-temporal, auditory, deep learning
Human hearing and human speech are intrinsically tied together, as the properties of speech almost certainly developed in order to be heard by human ears. As a result of this connection, it has been shown that certain properties of human hearing are mimicked within data-driven systems that are trained to understand human speech. In this paper, we further explore this phenomenon by measuring the spectro-temporal responses of data-derived filters in a front-end convolutional layer of a deep network trained to classify the phonemes of clean speech. The analyses show that the filters do indeed exhibit spectro-temporal responses similar to those measured in mammals, and also that the filters exhibit an additional level of frequency selectivity, similar to the processing pipeline assumed within the Articulation Index.
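The abstract describes analyzing the spectro-temporal responses of learned front-end convolutional filters. As a rough illustration (not the paper's actual analysis code), one common way to characterize such a filter is to take the 2D Fourier transform of its time-frequency kernel, yielding its response over temporal modulation rates and spectral modulation scales; the function name, padding size, and sampling parameters below are assumptions for the sketch.

```python
import numpy as np

def spectro_temporal_response(kernel, frame_rate_hz=100.0, channel_res=1.0):
    """Estimate the modulation response of a 2D conv kernel (time x frequency).

    Zero-pads the kernel and takes the 2D FFT magnitude; the resulting axes
    are temporal modulation rate (Hz) and spectral modulation scale
    (cycles per channel). All sizes here are illustrative assumptions.
    """
    pad = (64, 64)
    resp = np.abs(np.fft.fft2(kernel, s=pad))
    resp = np.fft.fftshift(resp)  # center zero modulation
    rates = np.fft.fftshift(np.fft.fftfreq(pad[0], d=1.0 / frame_rate_hz))
    scales = np.fft.fftshift(np.fft.fftfreq(pad[1], d=channel_res))
    return resp, rates, scales

# Toy example: a kernel tuned to a slow upward-sweeping modulation
t = np.arange(8)[:, None]
f = np.arange(8)[None, :]
kernel = np.cos(2 * np.pi * (0.05 * t - 0.1 * f))
resp, rates, scales = spectro_temporal_response(kernel)
```

In a trained network, `kernel` would instead be a slice of the first convolutional layer's weights; comparing the location of the response peak across filters is one way to look for the mammal-like tuning the paper reports.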
@INPROCEEDINGS{FITPUB12097,
  author    = "Francois Antoine Lucas Yang Ondel and Ruizhi Li and Gregory Sell and Hynek He\v{r}mansk\'{y}",
  title     = "Deriving Spectro-temporal Properties of Hearing from Speech Data",
  booktitle = "Proceedings of ICASSP",
  pages     = "411--415",
  year      = 2019,
  location  = "Brighton, GB",
  publisher = "IEEE Signal Processing Society",
  ISBN      = "978-1-5386-4658-8",
  doi       = "10.1109/ICASSP.2019.8682787",
  language  = "english",
  url       = "https://www.fit.vut.cz/research/publication/12097"
}