Keywords: Language Models, AI/ML Image Reconstruction, Image Synthesis
Motivation: We aim to introduce a foundation model based on visual and textual inputs to enable robust, unified image synthesis in multimodal MRI.
Goal(s): Our goal is to demonstrate a versatile foundation model that uses language guidance to describe target modalities accurately and adapts easily to new modalities and datasets through computationally efficient fine-tuning with minimal additional data and training.
Approach: Our approach conditions synthesis on source-modality images and target-modality text descriptions, using a text encoder to embed the textual inputs, a one-step latent diffusion model for fast synthesis, and low-rank adaptation for efficient fine-tuning.
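To make the approach concrete, the sketch below illustrates the general structure in PyTorch: a text embedding fused with a source-modality latent, a single conditional forward pass (no iterative sampling), and a low-rank adapter on a frozen pretrained layer. All names, shapes, and modules (text_encoder, backbone, decoder, synthesize) are hypothetical stand-ins, not the authors' implementation; only the overall scheme follows the abstract.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pretrained linear layer plus a trainable low-rank update (LoRA)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)            # keep pretrained weights fixed
        self.down = nn.Linear(base.in_features, rank, bias=False)
        self.up = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up.weight)         # adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))

# Toy stand-ins for the real components (hypothetical names and shapes).
text_encoder = nn.Sequential(                  # embeds a tokenized target description
    nn.Embedding(32, 64), nn.Flatten(1), nn.Linear(8 * 64, 64))
backbone = LoRALinear(nn.Linear(256 + 64, 256), rank=8)  # LoRA-adapted synthesis layer
decoder = nn.Linear(256, 1024)                 # maps the predicted latent to image space

def synthesize(source_latent, prompt_tokens):
    """One-step synthesis: a single conditional forward pass, no iterative sampling."""
    text_emb = text_encoder(prompt_tokens)                # e.g. "T2-weighted brain MRI"
    cond = torch.cat([source_latent, text_emb], dim=-1)   # fuse source image + text condition
    return decoder(backbone(cond))                        # predict the target image directly

out = synthesize(torch.randn(1, 256), torch.randint(0, 32, (1, 8)))
print(out.shape)  # torch.Size([1, 1024])
```

In this sketch only the small down/up adapter matrices are trainable, which is what makes low-rank fine-tuning cheap in data and compute, consistent with the efficiency goal stated above.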
Results: We demonstrated high-quality synthesis performance across various modalities and datasets.
Impact: Conventional synthesis models rely on image-to-image translation from visual inputs alone and often show limited generalizability. We demonstrate a language-guided foundation model that leverages textual inputs for improved adaptability to new modalities.