Keywords: Analysis/Processing, Alzheimer's Disease, multimodal, language-text
Motivation: The diagnosis of Alzheimer's disease (AD) considers not only clinical symptoms but also various data sources, including MR imaging.
Goal(s): In this study, we used a multimodal approach that integrates language and vision information to improve the performance of a clinical dementia rating (CDR) classification network.
Approach: We performed contrastive pre-training on paired language and vision data, then trained a classifier while keeping the pre-trained network frozen.
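The two-stage approach described above (contrastive pre-training on language-vision pairs, then classifier training with frozen encoders) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the encoder architectures, embedding dimension, temperature, a CLIP-style symmetric InfoNCE loss, and the three-class CDR head are all assumptions for demonstration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextEncoder(nn.Module):
    """Hypothetical stand-in for a language encoder (e.g. over report text)."""
    def __init__(self, in_dim=32, emb_dim=16):
        super().__init__()
        self.fc = nn.Linear(in_dim, emb_dim)
    def forward(self, x):
        return F.normalize(self.fc(x), dim=-1)  # unit-norm embeddings

class ImageEncoder(nn.Module):
    """Hypothetical stand-in for a vision encoder (e.g. over MR images)."""
    def __init__(self, in_dim=64, emb_dim=16):
        super().__init__()
        self.fc = nn.Linear(in_dim, emb_dim)
    def forward(self, x):
        return F.normalize(self.fc(x), dim=-1)

def contrastive_loss(text_emb, img_emb, temperature=0.07):
    # CLIP-style symmetric InfoNCE: matched text/image pairs lie on the diagonal.
    logits = text_emb @ img_emb.t() / temperature
    targets = torch.arange(logits.size(0))
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2

# Stage 1: contrastive pre-training on paired language-vision data (random toy batch).
text_enc, img_enc = TextEncoder(), ImageEncoder()
opt = torch.optim.Adam(list(text_enc.parameters()) + list(img_enc.parameters()), lr=1e-3)
texts, images = torch.randn(8, 32), torch.randn(8, 64)
for _ in range(5):
    loss = contrastive_loss(text_enc(texts), img_enc(images))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Stage 2: freeze the pre-trained encoders, then train only the classifier head.
for p in list(text_enc.parameters()) + list(img_enc.parameters()):
    p.requires_grad = False
classifier = nn.Linear(32, 3)  # 3 output classes is an assumption for CDR levels
fused = torch.cat([text_enc(texts), img_enc(images)], dim=-1)  # (8, 32) fused embedding
logits_out = classifier(fused)
```

In this sketch the fused representation is a simple concatenation of the two frozen embeddings; the abstract does not specify the fusion mechanism, so any fusion strategy (concatenation, summation, cross-attention) could stand in here.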
Results: The integrated model achieved the highest accuracy. In addition, the contrastive learning process improved the performance of the vision encoder under the guidance of abundant linguistic information.
Impact: With multimodal training, we successfully integrated vision and language information, and the integrated model yielded the best results. Multimodal training also enhanced the vision encoder's performance. When only limited language information was available, the complementary contribution of visual information was greater.