Abstract #3388

Foundation Models for Multimodal MRI Synthesis with Language Guidance

Mahmut Yurt1,2, Xiaozhi Cao3, Zihan Zhou3, Kawin Setsompop3, Shreyas Vasanawala2,3, and John Pauly1
1Department of Electrical Engineering, Stanford University, Stanford, CA, United States, 2Cardiovascular Institute, Stanford University, Stanford, CA, United States, 3Department of Radiology, Stanford University, Stanford, CA, United States

Synopsis

Keywords: Language Models, AI/ML Image Reconstruction, Image Synthesis

Motivation: We aim to introduce a foundation model based on visual and textual inputs to enable robust, unified image synthesis in multimodal MRI.

Goal(s): Our goal is to demonstrate a versatile foundation model that uses language guidance to describe target modalities accurately, and that adapts easily to new modalities and datasets via computationally efficient fine-tuning with minimal additional data and training.

Approach: Our approach conditions synthesis on source-modality images and target-modality text descriptions, using a text encoder to embed the textual inputs, a one-step latent diffusion model for fast synthesis, and low-rank adaptation for efficient fine-tuning.
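The low-rank adaptation (LoRA) fine-tuning mentioned above can be sketched as follows. This is a minimal illustrative example, not the authors' implementation: the function name, shapes, and scaling convention are assumptions, and the sketch shows only the core idea of freezing a pretrained weight while training a small low-rank update.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=1.0):
    """Forward pass through a frozen weight W plus a low-rank update.

    x: (batch, d_in) input activations
    W: (d_out, d_in) frozen pretrained weight
    A: (r, d_in) and B: (d_out, r) trainable low-rank factors, r << d_in.
    Only A and B are updated during fine-tuning, so the trainable
    parameter count drops from d_out * d_in to r * (d_in + d_out).
    """
    r = A.shape[0]
    delta = (B @ A) * (alpha / r)  # rank-r weight update, scaled by alpha/r
    return x @ (W + delta).T
```

Initializing B to zeros makes the adapted model start out identical to the pretrained one, which is the usual LoRA initialization; adaptation to a new modality then only has to learn the small factors A and B.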

Results: We demonstrated high-quality synthesis performance across various modalities and datasets.

Impact: Conventional synthesis models rely on image-to-image translation from visual inputs alone and often show limited generalizability. We demonstrate a foundation model with language guidance that leverages textual inputs for improved adaptability to new modalities.
