Interaction of bottom-up and top-down neural mechanisms in spatial multi-talker speech perception

Prachi Patel; Kiki van der Heijden; Stephan Bickel; Jose L Herrero; Ashesh D Mehta; Nima Mesgarani

doi:10.1016/j.cub.2022.07.047

Corresponding author: Nima Mesgarani

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

How the human auditory cortex represents spatially separated simultaneous talkers and how talkers' locations and voices modulate the neural representations of attended and unattended speech are unclear. Here, we measured the neural responses from electrodes implanted in neurosurgical patients as they performed single-talker and multi-talker speech perception tasks. We found that spatial separation between talkers caused a preferential encoding of the contralateral speech in Heschl's gyrus (HG), planum temporale (PT), and superior temporal gyrus (STG). Location and spectrotemporal features were encoded in different aspects of the neural response. Specifically, the talker's location changed the mean response level, whereas the talker's spectrotemporal features altered the variation of response around response's baseline. These components were differentially modulated by the attended talker's voice or location, which improved the population decoding of attended speech features. Attentional modulation due to the talker's voice only appeared in the auditory areas with longer latencies, but attentional modulation due to location was present throughout. Our results show that spatial multi-talker speech perception relies upon a separable pre-attentive neural representation, which could be further tuned by top-down attention to the location and voice of the talker.
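The abstract's distinction between a mean response level and the variation around a baseline can be pictured with a toy decomposition. The following is only a minimal sketch, not the authors' analysis: the function names `split_response` and `toy_response`, the sinusoidal signal, and the offset and gain values are all hypothetical choices made for illustration.

```python
import math
from statistics import fmean, pstdev


def split_response(response):
    """Split a response time series into its mean level and the
    fluctuation around that mean (illustrative decomposition only)."""
    level = fmean(response)
    fluctuation = [r - level for r in response]
    return level, fluctuation


def toy_response(offset, gain, n=200):
    """Hypothetical electrode response: a constant offset (standing in for
    the talker's location) plus a scaled oscillation (standing in for
    spectrotemporal drive). Not real data."""
    return [offset + gain * math.sin(2 * math.pi * 5 * t / n) for t in range(n)]


# Two assumed conditions that differ only in offset, not in fluctuation gain.
contra = toy_response(offset=2.0, gain=1.0)
ipsi = toy_response(offset=0.5, gain=1.0)

level_c, fluct_c = split_response(contra)
level_i, fluct_i = split_response(ipsi)

# The mean levels differ while the spread of the fluctuations matches.
print(round(level_c, 3), round(level_i, 3))
print(round(pstdev(fluct_c), 3), round(pstdev(fluct_i), 3))
```

In this toy setting, changing the offset moves only the mean level and changing the gain would move only the spread, which is one simple way to read "encoded in different aspects of the neural response".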

Original language	English
Pages (from-to)	3971-3986.e4
Number of pages	21
Journal	Current Biology
Volume	32
Issue number	18
Early online date	11 Aug 2022
DOIs	https://doi.org/10.1016/j.cub.2022.07.047
Publication status	Published - 26 Sept 2022

Keywords

Cocktail party
Features
Neurons
Pitch
Primary auditory-cortex
Representations
Selective attention
Sensitivity
Spectrotemporal receptive-fields
Time

Access to Document

10.1016/j.cub.2022.07.047 (Licence: Free access - publisher)

Cite this

@article{ab8dda42f72747e0bc02ed66b833fba0,

title = "Interaction of bottom-up and top-down neural mechanisms in spatial multi-talker speech perception",

abstract = "How the human auditory cortex represents spatially separated simultaneous talkers and how talkers' locations and voices modulate the neural representations of attended and unattended speech are unclear. Here, we measured the neural responses from electrodes implanted in neurosurgical patients as they performed single-talker and multi-talker speech perception tasks. We found that spatial separation between talkers caused a preferential encoding of the contralateral speech in Heschl's gyrus (HG), planum temporale (PT), and superior temporal gyrus (STG). Location and spectrotemporal features were encoded in different aspects of the neural response. Specifically, the talker's location changed the mean response level, whereas the talker's spectrotemporal features altered the variation of response around response's baseline. These components were differentially modulated by the attended talker's voice or location, which improved the population decoding of attended speech features. Attentional modulation due to the talker's voice only appeared in the auditory areas with longer latencies, but attentional modulation due to location was present throughout. Our results show that spatial multi-talker speech perception relies upon a separable pre-attentive neural representation, which could be further tuned by top-down attention to the location and voice of the talker.",

keywords = "Cocktail party, Features, Neurons, Pitch, Primary auditory-cortex, Representations, Selective attention, Sensitivity, Spectrotemporal receptive-fields, Time",

author = "Prachi Patel and {van der Heijden}, Kiki and Stephan Bickel and Herrero, {Jose L} and Mehta, {Ashesh D} and Nima Mesgarani",

note = "Funding Information: We thank Richard Thompson Lee for their comments on the manuscript text. This work was supported by National Institutes of Health grant R01DC018805 and National Institute on Deafness and Other Communication Disorders grant R01DC014279. Publisher Copyright: {\textcopyright} 2022 Elsevier Inc.",

year = "2022",

month = sep,

day = "26",

doi = "10.1016/j.cub.2022.07.047",

language = "English",

volume = "32",

pages = "3971--3986.e4",

journal = "Current Biology",

issn = "0960-9822",

publisher = "Cell Press",

number = "18",

}

TY - JOUR

T1 - Interaction of bottom-up and top-down neural mechanisms in spatial multi-talker speech perception

AU - Patel, Prachi

AU - van der Heijden, Kiki

AU - Bickel, Stephan

AU - Herrero, Jose L

AU - Mehta, Ashesh D

AU - Mesgarani, Nima

N1 - Funding Information: We thank Richard Thompson Lee for their comments on the manuscript text. This work was supported by National Institutes of Health grant R01DC018805 and National Institute on Deafness and Other Communication Disorders grant R01DC014279. Publisher Copyright: © 2022 Elsevier Inc.

PY - 2022/9/26

Y1 - 2022/9/26

N2 - How the human auditory cortex represents spatially separated simultaneous talkers and how talkers' locations and voices modulate the neural representations of attended and unattended speech are unclear. Here, we measured the neural responses from electrodes implanted in neurosurgical patients as they performed single-talker and multi-talker speech perception tasks. We found that spatial separation between talkers caused a preferential encoding of the contralateral speech in Heschl's gyrus (HG), planum temporale (PT), and superior temporal gyrus (STG). Location and spectrotemporal features were encoded in different aspects of the neural response. Specifically, the talker's location changed the mean response level, whereas the talker's spectrotemporal features altered the variation of response around response's baseline. These components were differentially modulated by the attended talker's voice or location, which improved the population decoding of attended speech features. Attentional modulation due to the talker's voice only appeared in the auditory areas with longer latencies, but attentional modulation due to location was present throughout. Our results show that spatial multi-talker speech perception relies upon a separable pre-attentive neural representation, which could be further tuned by top-down attention to the location and voice of the talker.

AB - How the human auditory cortex represents spatially separated simultaneous talkers and how talkers' locations and voices modulate the neural representations of attended and unattended speech are unclear. Here, we measured the neural responses from electrodes implanted in neurosurgical patients as they performed single-talker and multi-talker speech perception tasks. We found that spatial separation between talkers caused a preferential encoding of the contralateral speech in Heschl's gyrus (HG), planum temporale (PT), and superior temporal gyrus (STG). Location and spectrotemporal features were encoded in different aspects of the neural response. Specifically, the talker's location changed the mean response level, whereas the talker's spectrotemporal features altered the variation of response around response's baseline. These components were differentially modulated by the attended talker's voice or location, which improved the population decoding of attended speech features. Attentional modulation due to the talker's voice only appeared in the auditory areas with longer latencies, but attentional modulation due to location was present throughout. Our results show that spatial multi-talker speech perception relies upon a separable pre-attentive neural representation, which could be further tuned by top-down attention to the location and voice of the talker.

KW - Cocktail party

KW - Features

KW - Neurons

KW - Pitch

KW - Primary auditory-cortex

KW - Representations

KW - Selective attention

KW - Sensitivity

KW - Spectrotemporal receptive-fields

KW - Time

U2 - 10.1016/j.cub.2022.07.047

DO - 10.1016/j.cub.2022.07.047

M3 - Article

C2 - 35973430

SN - 0960-9822

VL - 32

SP - 3971

EP - 3986.e4

JO - Current Biology

JF - Current Biology

IS - 18

ER -