TY - GEN
T1 - UniCross
T2 - 28th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2025
AU - Yin, Lisong
AU - Ye, Chuyang
AU - Liu, Tiantian
AU - Wu, Jinglong
AU - Yan, Tianyi
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
N2 - Early and accurate diagnosis of Alzheimer’s disease (AD) is crucial for effective treatment and patient care. In clinical practice, physicians can achieve precise diagnoses by integrating multimodal image information, and it is therefore desirable to develop automated diagnosis approaches based on such multimodal information. However, existing multimodal deep learning methods face a critical paradox: although models excel at leveraging joint features to improve task performance, they often neglect the optimization of independent representation capabilities for each individual modality. This shortcoming, known as Modality Laziness, stems from imbalanced modality contributions within conventional joint training frameworks, where models predominantly rely on dominant modalities and fail to learn weaker ones. To address this challenge, we propose UniCross, a novel balanced multimodal learning paradigm. Specifically, UniCross employs separate learning pathways with specialized training objectives for each modality to ensure comprehensive uni-modal feature learning. In addition, we design a Metadata Weighted Contrastive Loss (MWCL) to facilitate effective cross-modal information interaction. The MWCL leverages patient metadata (e.g., age, gender, and years of education) to adaptively calibrate both cross-modal and intra-modal feature distances between individuals. We validated our approach through extensive experiments on the ADNI dataset, using structural MRI and FDG-PET modalities for AD diagnosis and mild cognitive impairment (MCI) conversion prediction tasks. The results demonstrate that UniCross not only achieves state-of-the-art overall performance but also significantly improves diagnosis performance when only a single modality is available. Our code is available at http://github.com/Alita-song/UniCross
AB - Early and accurate diagnosis of Alzheimer’s disease (AD) is crucial for effective treatment and patient care. In clinical practice, physicians can achieve precise diagnoses by integrating multimodal image information, and it is therefore desirable to develop automated diagnosis approaches based on such multimodal information. However, existing multimodal deep learning methods face a critical paradox: although models excel at leveraging joint features to improve task performance, they often neglect the optimization of independent representation capabilities for each individual modality. This shortcoming, known as Modality Laziness, stems from imbalanced modality contributions within conventional joint training frameworks, where models predominantly rely on dominant modalities and fail to learn weaker ones. To address this challenge, we propose UniCross, a novel balanced multimodal learning paradigm. Specifically, UniCross employs separate learning pathways with specialized training objectives for each modality to ensure comprehensive uni-modal feature learning. In addition, we design a Metadata Weighted Contrastive Loss (MWCL) to facilitate effective cross-modal information interaction. The MWCL leverages patient metadata (e.g., age, gender, and years of education) to adaptively calibrate both cross-modal and intra-modal feature distances between individuals. We validated our approach through extensive experiments on the ADNI dataset, using structural MRI and FDG-PET modalities for AD diagnosis and mild cognitive impairment (MCI) conversion prediction tasks. The results demonstrate that UniCross not only achieves state-of-the-art overall performance but also significantly improves diagnosis performance when only a single modality is available. Our code is available at http://github.com/Alita-song/UniCross
KW - Alzheimer’s Disease
KW - Balanced Multimodal Learning
KW - Contrastive Learning
UR - http://www.scopus.com/pages/publications/105017956515
U2 - 10.1007/978-3-032-05182-0_62
DO - 10.1007/978-3-032-05182-0_62
M3 - Conference contribution
AN - SCOPUS:105017956515
SN - 9783032051813
T3 - Lecture Notes in Computer Science
SP - 638
EP - 648
BT - Medical Image Computing and Computer Assisted Intervention, MICCAI 2025 - 28th International Conference, Proceedings
A2 - Gee, James C.
A2 - Hong, Jaesung
A2 - Sudre, Carole H.
A2 - Golland, Polina
A2 - Alexander, Daniel C.
A2 - Iglesias, Juan Eugenio
A2 - Venkataraman, Archana
A2 - Kim, Jong Hyo
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 23 September 2025 through 27 September 2025
ER -