Abstract #4298

Cross-modal brain decoding: using fMRI to decode video stimuli from integrated sensory streams

Matteo Ferrante¹, Tommaso Boccato¹, and Nicola Toschi¹

¹University of Rome Tor Vergata, Rome, Italy

Synopsis

Keywords: Analysis/Processing, AI/ML Software, brain decoding, fMRI

Motivation: Understanding how the human brain decodes complex video stimuli can improve our insight into multisensory integration and advance brain-computer interfaces.

Goal(s): This study aims to enhance video decoding from fMRI data by integrating visual, textual, and audio streams, hypothesizing that each stream independently contributes to brain representation.

Approach: Using the BOLDMoments dataset of video-fMRI pairs, we developed cross-subject decoding models that retrieve seen videos from fMRI activity. The models estimate the multimodal embeddings produced by pre-trained models (XCLIP, CLAP) directly from preprocessed fMRI data.
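To make the retrieval idea concrete, the following is a minimal sketch on synthetic data, not the authors' pipeline. It assumes a linear ridge mapping from voxel patterns to embedding vectors and cosine-similarity ranking of candidate embeddings. The synopsis does not state the model class, so the ridge mapping, the function names (`fit_ridge`, `retrieve`), and all array sizes are illustrative assumptions. The random arrays stand in for preprocessed fMRI responses and for XCLIP/CLAP embeddings.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    gram = X.T @ X + alpha * np.eye(n_features)
    return np.linalg.solve(gram, X.T @ Y)

def l2_normalize(A, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return A / (np.linalg.norm(A, axis=1, keepdims=True) + eps)

def retrieve(pred, candidates):
    """For each predicted embedding, rank candidate indices by decreasing cosine similarity."""
    sims = l2_normalize(pred) @ l2_normalize(candidates).T
    return np.argsort(-sims, axis=1)

# Synthetic stand-ins (assumption): rows of X are fMRI patterns, rows of Y are stimulus embeddings.
rng = np.random.default_rng(0)
n_train, n_test, n_voxels, dim = 200, 20, 50, 16
W_true = rng.normal(size=(n_voxels, dim))
X_train = rng.normal(size=(n_train, n_voxels))
X_test = rng.normal(size=(n_test, n_voxels))
Y_train = X_train @ W_true + 0.1 * rng.normal(size=(n_train, dim))
Y_test = X_test @ W_true

# Fit the mapping on training pairs, then retrieve each test stimulus from the test pool.
W = fit_ridge(X_train, Y_train, alpha=1.0)
ranking = retrieve(X_test @ W, Y_test)
top1 = float(np.mean(ranking[:, 0] == np.arange(n_test)))
```

Here `ranking[i, 0]` is the index of the candidate judged most similar to the embedding predicted from test pattern `i`, so `top1` is the fraction of test patterns whose own stimulus is ranked first.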

Results: Multimodal integration (Video+Text+Audio) achieved the highest retrieval and identification accuracy, underscoring the advantage of combined sensory information for decoding.
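As an illustration of how such scores could be computed, the sketch below evaluates top-k retrieval accuracy and a pairwise (two-way) identification accuracy from a similarity matrix between predicted and true embeddings. The synopsis does not define either metric or say how the streams were combined, so these definitions, the concatenation in `combine_modalities`, and all function names are assumptions made for illustration.

```python
import numpy as np

def cosine_similarity_matrix(pred, true, eps=1e-12):
    """Entry (i, j) is the cosine similarity between prediction i and true embedding j."""
    p = pred / (np.linalg.norm(pred, axis=1, keepdims=True) + eps)
    t = true / (np.linalg.norm(true, axis=1, keepdims=True) + eps)
    return p @ t.T

def top_k_accuracy(sims, k=1):
    """Fraction of rows where fewer than k candidates score strictly above the correct one."""
    correct = np.diag(sims)[:, None]
    rank = (sims > correct).sum(axis=1)
    return float(np.mean(rank < k))

def pairwise_identification_accuracy(sims):
    """Fraction of (correct, distractor) pairs where the correct candidate scores strictly higher."""
    n = sims.shape[0]
    correct = np.diag(sims)[:, None]
    wins = (correct > sims).sum()  # the diagonal never counts, since a value is not greater than itself
    return float(wins / (n * (n - 1)))

def combine_modalities(*embeddings):
    """Assumed fusion rule: L2-normalize each modality, then concatenate along the feature axis."""
    normed = [e / (np.linalg.norm(e, axis=1, keepdims=True) + 1e-12) for e in embeddings]
    return np.concatenate(normed, axis=1)
```

With these definitions, a chance-level decoder scores about k/n on top-k accuracy and about 0.5 on pairwise identification, which gives a baseline against which single-stream and combined-stream results can be compared.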

Impact: This work opens pathways for more accurate brain decoding in multisensory contexts, potentially advancing brain-computer interfaces and aiding clinical applications in sensory processing disorders.
