TY - GEN
T1 - PFCA
T2 - 41st IEEE International Conference on Data Engineering, ICDE 2025
AU - Wang, Hao
AU - Shi, Jiyun
AU - Chen, Yuhao
AU - Xu, Haochen
AU - Zhang, Chi
AU - Luo, Zhaojing
AU - Zhang, Meihui
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
AB - Electronic health records (EHRs) store patient medical history in a structured format, which facilitates automatic healthcare risk prediction and thereby improves personalized healthcare management and treatment. There are two main categories of methods for automatic healthcare risk prediction. The first models time-series information or relationships between visits to obtain enhanced patient representations. However, given the high dimensionality of EHR data, it often yields compromised results due to the lack of training data. The second exploits external knowledge, e.g., knowledge graphs (KGs), to augment the training data, but less attention has been paid to distinguishing the importance of features and filtering out irrelevant external knowledge, leading to overwhelming noise and inefficiency. Additionally, the joint relationships between patient features, which are highlighted in clinical practice, have not been emphasized. In this paper, we propose an efficient Path Filtering with Causal Analysis (PFCA) approach for enhanced healthcare risk prediction to address these challenges. PFCA first extracts personalized knowledge graphs (PKGs) consisting of paths linking the patient's features to targets and then applies a fine-grained filtering method based on path messages to remove irrelevant paths for better efficiency. We then develop an effective similarity-based method that models different features' joint interactions with targets to learn augmented representations for each feature. Furthermore, we design a causal analysis method that includes a novel causal intervention mechanism to mine and prioritize causal features for improved predictive performance. Finally, by exploiting the attention weights of paths in the PKGs, PFCA provides target-oriented interpretations, showing how patients' features lead to targets through significant paths. Experimental results on three public real-world datasets and four healthcare risk prediction tasks confirm PFCA's effectiveness in improving predictive performance compared to ten state-of-the-art baselines, and demonstrate the efficiency of its path filtering and its interpretability.
KW - causal analysis
KW - efficient healthcare analytics
KW - electronic health record
KW - interpretable analytics
KW - knowledge graphs
UR - http://www.scopus.com/pages/publications/105015532930
U2 - 10.1109/ICDE65448.2025.00176
DO - 10.1109/ICDE65448.2025.00176
M3 - Conference contribution
AN - SCOPUS:105015532930
T3 - Proceedings - International Conference on Data Engineering
SP - 2323
EP - 2336
BT - Proceedings - 2025 IEEE 41st International Conference on Data Engineering, ICDE 2025
PB - IEEE Computer Society
Y2 - 19 May 2025 through 23 May 2025
ER -