From sound to meaning: Brain-inspired deep neural networks for sound recognition

Research output: Thesis › Doctoral Thesis › Internal

Abstract

This thesis investigates how humans recognise and make sense of complex everyday sounds by integrating insights from neuroscience and artificial intelligence (AI). Specifically, it focuses on developing and evaluating deep neural network (DNN) models that simulate how the human brain processes auditory stimuli. The overarching goal is to bridge the gap between biological and artificial hearing systems, using computational models that are both functionally effective and biologically plausible.

The first part of the research evaluates various auditory models—including traditional acoustic models and state-of-the-art DNNs—by comparing their ability to predict both brain activity (via functional MRI) and human behavioural judgments of sound similarity. The results highlight that certain DNNs capture a level of auditory representation between raw acoustic features and abstract semantic categories. This intermediate representation, referred to as “hyperacoustic,” is particularly prominent in the superior temporal gyrus (STG), suggesting it plays a critical role in transforming sound into meaning.
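A standard way to relate model features to brain activity and behavioural similarity judgments, in the spirit of the evaluation described above, is representational similarity analysis. The sketch below is a minimal, hypothetical illustration in NumPy; the function names, toy data, and the use of correlation-distance matrices are assumptions for illustration, not the thesis's actual pipeline.

```python
import numpy as np

def rdm(features: np.ndarray) -> np.ndarray:
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between every pair of stimulus response vectors (stimuli x features)."""
    return 1.0 - np.corrcoef(features)

def rsa_score(model_features: np.ndarray, brain_responses: np.ndarray) -> float:
    """Spearman-style correlation between the upper triangles of the
    model RDM and the brain RDM."""
    m, b = rdm(model_features), rdm(brain_responses)
    iu = np.triu_indices_from(m, k=1)
    rank = lambda x: np.argsort(np.argsort(x)).astype(float)
    return float(np.corrcoef(rank(m[iu]), rank(b[iu]))[0, 1])

# Toy example: 20 sounds, a DNN layer with 64 units, 100 voxels whose
# responses are (by construction) driven by the same features.
rng = np.random.default_rng(0)
dnn_layer = rng.normal(size=(20, 64))
voxels = dnn_layer @ rng.normal(size=(64, 100))
print(rsa_score(dnn_layer, voxels))
```

Behavioural similarity judgments can be compared the same way, by substituting a judgment-derived dissimilarity matrix for the brain RDM.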

The second part introduces semantically informed DNNs (semDNNs), which are trained using continuous semantic embeddings (Word2Vec and BERT) instead of traditional categorical labels with one-hot encoding. These models align more closely with human judgments and offer a cognitively inspired approach to sound categorisation, demonstrating the value of integrating linguistic knowledge into auditory models.
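The core difference between the two training regimes can be sketched in a few lines: one-hot cross-entropy penalises every wrong category equally, whereas a loss defined on continuous embeddings penalises semantically close confusions less. The embeddings below are made-up 4-dimensional vectors (real Word2Vec or BERT vectors have hundreds of dimensions), and the loss functions are a generic illustration, not the thesis's exact semDNN objective.

```python
import numpy as np

# Illustrative "semantic" embeddings for three sound sources (invented values).
embeddings = {
    "dog_bark": np.array([0.9, 0.1, 0.0, 0.2]),
    "cat_meow": np.array([0.8, 0.2, 0.1, 0.1]),  # semantically close to dog_bark
    "car_horn": np.array([0.0, 0.9, 0.8, 0.1]),  # semantically distant
}

def one_hot_loss(logits: np.ndarray, label_index: int) -> float:
    """Cross-entropy with a one-hot target: all errors cost the same."""
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    return float(-log_probs[label_index])

def semantic_loss(predicted: np.ndarray, target: np.ndarray) -> float:
    """Cosine-distance loss against a continuous embedding: predictions
    near semantically related categories are penalised less."""
    cos = predicted @ target / (np.linalg.norm(predicted) * np.linalg.norm(target))
    return float(1.0 - cos)

# Mistaking a dog bark for a cat meow costs less than mistaking it for a car horn:
print(semantic_loss(embeddings["cat_meow"], embeddings["dog_bark"]))
print(semantic_loss(embeddings["car_horn"], embeddings["dog_bark"]))
```

This graded penalty is what lets embedding-trained models mirror human similarity judgments more closely than hard categorical targets do.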

Finally, the third part presents a multiscale time-resolved DNN architecture that processes sound at multiple temporal scales and includes attention mechanisms to capture salient auditory events. This design mirrors the human auditory system’s hierarchical and dynamic nature and represents a significant step toward building models that can operate in real-world listening environments.
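The two ingredients named here, analysis at multiple temporal scales and attention over salient events, can be illustrated with a deliberately simple NumPy sketch. The windowing scheme, energy features, and softmax pooling below are illustrative assumptions, not the architecture actually proposed in the thesis.

```python
import numpy as np

def frame_energy(signal: np.ndarray, win: int) -> np.ndarray:
    """Mean energy in non-overlapping windows of `win` samples:
    one temporal scale of analysis."""
    n = len(signal) // win
    return (signal[: n * win] ** 2).reshape(n, win).mean(axis=1)

def attention_pool(features: np.ndarray) -> float:
    """Softmax attention over time: frames containing salient
    (high-energy) events dominate the pooled summary."""
    scores = features - features.max()
    weights = np.exp(scores) / np.exp(scores).sum()
    return float((weights * features).sum())

rng = np.random.default_rng(1)
signal = rng.normal(scale=0.1, size=16000)            # 1 s of quiet noise at 16 kHz
signal[8000:8800] += rng.normal(scale=1.0, size=800)  # one salient event

# Analyse the same signal at short, medium and long temporal scales,
# then let attention emphasise the salient frames at each scale.
summary = [attention_pool(frame_energy(signal, win)) for win in (400, 1600, 6400)]
print(summary)
```

Because the attention weights grow with frame energy, the pooled value sits above the plain temporal average, which is the mechanism by which brief salient events survive summarisation over long windows.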

This work contributes to cognitive neuroscience, machine learning, and auditory artificial intelligence by introducing computational models that achieve strong technical performance and provide insights into how humans perceive, interpret, and understand the sounds in their environment.
Original language: English
Qualification: Doctor of Philosophy
Awarding Institution:
  • Maastricht University
Supervisors/Advisors:
  • Formisano, Elia, Supervisor
  • Valente, Giancarlo, Co-Supervisor
Award date: 8 May 2025
Place of Publication: Maastricht
Print ISBNs: 978-94-6473-803-2
Electronic ISBNs: 978-94-6473-803-2
Publication status: Published - 8 May 2025
