Abstract
Deep neural networks (DNNs) for sound recognition learn to categorize a barking sound as a "dog" and a meowing sound as a "cat" but do not exploit information inherent to the semantic relations between classes (e.g., both are animal vocalisations). Cognitive neuroscience research, however, suggests that human listeners automatically exploit higher-level semantic information on the sources besides acoustic information. Inspired by this notion, we introduce here a DNN that learns to recognize sounds and simultaneously learns the semantic relation between the sources (semDNN). Comparison of semDNN with a homologous network trained with categorical labels (catDNN) revealed that semDNN produces semantically more accurate labelling than catDNN in sound recognition tasks and that semDNN embeddings preserve higher-level semantic relations between sound sources. Importantly, through a model-based analysis of human dissimilarity ratings of natural sounds, we show that semDNN approximates the behaviour of human listeners better than catDNN and several other DNN and NLP comparison models.
| Original language | English |
|---|---|
| Title of host publication | ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing, Proceedings |
| Publisher | IEEE |
| ISBN (Electronic) | 9781728163277 |
| DOIs | |
| Publication status | Published - 5 May 2023 |
| Event | 48th IEEE International Conference on Acoustics, Speech and Signal Processing - Rhodes Island, Greece. Duration: 4 Jun 2023 → 10 Jun 2023. https://2023.ieeeicassp.org/ |
Publication series
| Series | ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings |
|---|---|
| Volume | 2023-June |
| ISSN | 1520-6149 |
Conference
| Conference | 48th IEEE International Conference on Acoustics, Speech and Signal Processing |
|---|---|
| Abbreviated title | ICASSP 2023 |
| Country/Territory | Greece |
| City | Rhodes Island |
| Period | 4/06/23 → 10/06/23 |
| Internet address | |
Keywords
- acoustic-to-semantic transformation
- auditory semantics
- deep neural networks
- natural sound recognition
- semantic embeddings
Fingerprint
Dive into the research topics of 'Semantically-Informed Deep Neural Networks For Sound Recognition'. Together they form a unique fingerprint.
Research output
- 1 Doctoral Thesis
- From sound to meaning: brain-inspired deep neural networks for sound recognition
  Esposito, M., 8 May 2025, Maastricht: Maastricht University. 196 p. Research output: Thesis › Doctoral Thesis › Internal. Open Access.