Robust inter-subject audiovisual decoding in functional magnetic resonance imaging using high-dimensional regression

Gal Raz*, Michele Svanera, Neomi Singer, Gadi Gilam, Maya Bleich-Cohen, Tamar Lin, Roee Admon, Tal Gonen, Avner Thaler, Roni Y Granot, Rainer Goebel, Sergio Benini, Giancarlo Valente

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Major methodological advances have recently been made in the field of neural decoding, which is concerned with the reconstruction of mental content from neuroimaging measures. However, in the absence of a large-scale examination of the validity of decoding models across subjects and content, the extent to which these models can be generalized is not clear. This study addresses the challenge of producing generalizable decoding models, which allow the reconstruction of perceived audiovisual features from human functional magnetic resonance imaging (fMRI) data without prior training of the algorithm on the decoded content. We applied an adapted version of kernel ridge regression combined with temporal optimization on data acquired during film viewing (234 runs) to generate standardized brain models for sound loudness, speech presence, perceived motion, face-to-frame ratio, lightness, and color brightness. The prediction accuracies were tested on data collected from different subjects watching other movies, mostly in a different scanner. Substantial and significant (Q(FDR) < 0.05) correlations between the reconstructed and the original descriptors were found for the first three features (loudness, speech, and motion) in all of the 9 test movies (R̄ = 0.62, R̄ = 0.60, and R̄ = 0.60, respectively), with high reproducibility of the predictors across subjects. The face-ratio model produced significant correlations in 7 out of 8 movies (R̄ = 0.56). The lightness and brightness models did not show robustness (R̄ = 0.23 and R̄ = 0, respectively). Further analysis of additional data (95 runs) indicated that the veridicality of loudness reconstruction can consistently reveal relevant group differences in musical experience. The findings point to the validity and generalizability of our loudness, speech, motion, and face-ratio models for complex cinematic stimuli (as well as for music in the case of loudness). While future research should further validate these models using controlled stimuli and explore the feasibility of extracting more complex models via this method, the reliability of our results indicates the potential usefulness of the approach and the resulting models in basic scientific and diagnostic contexts.
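To make the decoding idea concrete, the following is a minimal sketch, assuming synthetic data, scikit-learn's KernelRidge, and a crude grid search over hemodynamic lags as a stand-in for the paper's temporal optimization. All array shapes, the lag grid, and the regularization value are illustrative assumptions, not the published pipeline: the model learns to map fMRI time series to one audiovisual descriptor (e.g., loudness) and is then scored on an unseen movie with Pearson correlation.

```python
import numpy as np
from scipy.stats import pearsonr
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)

# Hypothetical stand-ins for real data: fMRI time series (TRs x voxels) and a
# per-TR audiovisual descriptor (e.g., sound loudness) for a training movie
# and a held-out test movie watched by different subjects.
X_train = rng.standard_normal((600, 2000))
y_train = rng.standard_normal(600)
X_test = rng.standard_normal((400, 2000))
y_test = rng.standard_normal(400)

best = (-np.inf, 0, None)  # (fit correlation, lag in TRs, fitted model)
for lag in range(5):
    # The BOLD signal trails the stimulus, so pair the descriptor value at
    # TR t with the scan at TR t + lag; searching over a few lags is a crude
    # illustration of temporal optimization.
    X_lag, y_lag = X_train[lag:], y_train[:len(X_train) - lag]
    model = KernelRidge(kernel="linear", alpha=1.0).fit(X_lag, y_lag)
    r, _ = pearsonr(model.predict(X_lag), y_lag)
    if r > best[0]:
        best = (r, lag, model)

_, lag, model = best
# Reconstruct the descriptor for the unseen movie at the chosen lag and score
# it with Pearson correlation (the R-bar values in the abstract summarize
# such correlations across test movies).
y_pred = model.predict(X_test[lag:])
r_test, _ = pearsonr(y_pred, y_test[:len(y_pred)])
print(f"lag = {lag} TRs, test correlation R = {r_test:.2f}")
```

In practice one would cross-validate the regularization strength and the lag, and apply the preprocessing and inter-subject standardization described in the paper; this sketch only illustrates the regression-plus-temporal-alignment idea.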

Original language: English
Pages (from-to): 244-263
Number of pages: 20
Journal: NeuroImage
Volume: 163
Early online date: 20 Sept 2017
DOIs
Publication status: Published - Dec 2017

Keywords

  • fMRI
  • Audiovisual decoding
  • Motion pictures
  • Kernel ridge regression
  • Sound loudness
  • Optical flow
  • Face
  • HUMAN VISUAL-CORTEX
  • VENTRAL TEMPORAL CORTEX
  • HUMAN CORTICAL ANATOMY
  • HUMAN BRAIN ACTIVITY
  • RETINEX THEORY
  • REPRESENTATION
  • ALIGNMENT
  • MOTION
  • FMRI
  • EMOTIONS
