TY - JOUR
T1 - MAGIC
T2 - An LLM-based multi-agent activated graph-reasoning intelligent collaboration model for liver disease diagnosis
AU - Liu, Bowen
AU - Nie, Yaqing
AU - Song, Hong
AU - Lin, Yucong
AU - Li, Jingtao
AU - Weng, Xutao
AU - Su, Zhaoli
AU - Suo, Yuhong
AU - Lv, Tingting
AU - Zhao, Xinyan
AU - Yang, Jian
N1 - Publisher Copyright: © 2025
PY - 2026/2
Y1 - 2026/2
AB - Large language models (LLMs) perform well in general medical fields, but their effective application in complex liver disease diagnosis remains an open question. We propose an LLM-based Multi-agent Activated Graph-reasoning Intelligent Collaboration (MAGIC) model to address this challenge. MAGIC enhances liver disease knowledge through multi-scale analysis, including similar case studies, abnormal indicator identification, and knowledge graph analysis. During the simulated clinical progressive diagnostic process, the model adjusts key nodes and relationship weights in the graph reasoning using multi-agent debate results, improving pre-diagnosis accuracy. The model then verifies the pre-diagnosis results against guidelines to ensure their alignment with established clinical standards, ultimately generating reliable diagnostic results. Extensive experiments demonstrated that MAGIC achieved an accuracy of 94.5 % on the LiverQ&A dataset from Beijing Friendship Hospital, with an 11.39 % improvement in F1 over the best LLM-based SOTA model. MAGIC also achieved 91.6 % accuracy on the multi-center validation dataset, which included data from Beijing You'an Hospital and China-Japan Friendship Hospital. Additionally, on the public dataset MedQA, our approach improved the accuracy of a closed-source model by 1.7 % to 6.8 %.
KW - Graph reasoning
KW - Guidelines verification
KW - Large language models
KW - Liver disease diagnosis
KW - Multi-agent debate
KW - Multi-scale knowledge analysis
UR - http://www.scopus.com/pages/publications/105012251167
U2 - 10.1016/j.inffus.2025.103557
DO - 10.1016/j.inffus.2025.103557
M3 - Article
AN - SCOPUS:105012251167
SN - 1566-2535
VL - 126
JO - Information Fusion
JF - Information Fusion
M1 - 103557
ER -