PFCA: Efficient Path Filtering with Causal Analysis for Healthcare Risk Prediction

Hao Wang; Jiyun Shi; Yuhao Chen; Haochen Xu; Chi Zhang; Zhaojing Luo; Meihui Zhang

doi:10.1109/ICDE65448.2025.00176

Hao Wang, Jiyun Shi, Yuhao Chen, Haochen Xu, Chi Zhang, Zhaojing Luo^*, Meihui Zhang

^*Corresponding author for this work

School of Computer Science and Technology

Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Electronic health records (EHRs) store patient medical history in a structured format, which facilitates automatic healthcare risk prediction and thereby improves personalized healthcare management and treatment. There are two main categories of methods for automatic healthcare risk prediction. The first models time-series information or relationships between visits to obtain enhanced patient representations. However, given the high dimensionality of EHR data, it often yields compromised results due to the lack of training data. The second exploits external knowledge, e.g., knowledge graphs (KGs), to augment the training data, but less attention has been paid to distinguishing the importance of features and filtering out irrelevant external knowledge, leading to overwhelming noise and inefficiency. Additionally, the joint relationships between patient features, which are highlighted in clinical practice, have not been emphasized. In this paper, we propose an efficient Path Filtering with Causal Analysis (PFCA) approach for enhanced healthcare risk prediction to address these challenges. PFCA first extracts personalized knowledge graphs (PKGs) consisting of paths linking the patient's features to targets and then devises a fine-grained filtering method based on path messages to remove irrelevant paths for better efficiency. We then develop an effective similarity-based method to model different features' joint interactions with targets to learn augmented representations for each feature. Furthermore, we design a causal analysis method that includes a novel causal intervention mechanism to mine and prioritize causal features for improved predictive performance. Finally, by exploiting the attention weights of paths in the PKGs, PFCA provides target-oriented interpretations, showing how patients' features lead to targets through significant paths. Experimental results on three public real-world datasets and four healthcare risk prediction tasks confirm PFCA's effectiveness in improving predictive performance compared to ten state-of-the-art baselines and demonstrate the efficiency of its path filtering and its interpretability.

Original language	English
Title of host publication	Proceedings - 2025 IEEE 41st International Conference on Data Engineering, ICDE 2025
Publisher	IEEE Computer Society
Pages	2323-2336
Number of pages	14
ISBN (Electronic)	9798331536039
DOIs	http://doi.org/10.1109/ICDE65448.2025.00176
Publication status	Published - 2025
Event	41st IEEE International Conference on Data Engineering, ICDE 2025 - Hong Kong, China; Duration: 19 May 2025 → 23 May 2025

Publication series

Name	Proceedings - International Conference on Data Engineering
ISSN (Print)	1084-4627
ISSN (Electronic)	2375-0286

Conference

Conference	41st IEEE International Conference on Data Engineering, ICDE 2025
Country/Territory	China
City	Hong Kong
Period	19/05/25 → 23/05/25

Keywords

causal analysis
Efficient healthcare analytics
Electronic health record
interpretable analytics
knowledge graphs

Access to Document

10.1109/ICDE65448.2025.00176

Cite this

Wang, H., Shi, J., Chen, Y., Xu, H., Zhang, C., Luo, Z., & Zhang, M. (2025). PFCA: Efficient Path Filtering with Causal Analysis for Healthcare Risk Prediction. In Proceedings - 2025 IEEE 41st International Conference on Data Engineering, ICDE 2025 (pp. 2323-2336). (Proceedings - International Conference on Data Engineering). IEEE Computer Society. http://doi.org/10.1109/ICDE65448.2025.00176

@inproceedings{3a8dfce020164c2aa4f75028b84a7708,

title = "PFCA: Efficient Path Filtering with Causal Analysis for Healthcare Risk Prediction",

abstract = "Electronic health records (EHRs) store patient medical history in a structured format, which facilitates automatic healthcare risk prediction and thereby improves personalized healthcare management and treatment. There are two main categories of methods for automatic healthcare risk prediction. The first models time-series information or relationships between visits to obtain enhanced patient representations. However, given the high dimensionality of EHR data, it often yields compromised results due to the lack of training data. The second exploits external knowledge, e.g., knowledge graphs (KGs), to augment the training data, but less attention has been paid to distinguishing the importance of features and filtering out irrelevant external knowledge, leading to overwhelming noise and inefficiency. Additionally, the joint relationships between patient features, which are highlighted in clinical practice, have not been emphasized. In this paper, we propose an efficient Path Filtering with Causal Analysis (PFCA) approach for enhanced healthcare risk prediction to address these challenges. PFCA first extracts personalized knowledge graphs (PKGs) consisting of paths linking the patient's features to targets and then devises a fine-grained filtering method based on path messages to remove irrelevant paths for better efficiency. We then develop an effective similarity-based method to model different features' joint interactions with targets to learn augmented representations for each feature. Furthermore, we design a causal analysis method that includes a novel causal intervention mechanism to mine and prioritize causal features for improved predictive performance. Finally, by exploiting the attention weights of paths in the PKGs, PFCA provides target-oriented interpretations, showing how patients' features lead to targets through significant paths. Experimental results on three public real-world datasets and four healthcare risk prediction tasks confirm PFCA's effectiveness in improving predictive performance compared to ten state-of-the-art baselines and demonstrate the efficiency of its path filtering and its interpretability.",

keywords = "causal analysis, Efficient healthcare analytics, Electronic health record, interpretable analytics, knowledge graphs",

author = "Hao Wang and Jiyun Shi and Yuhao Chen and Haochen Xu and Chi Zhang and Zhaojing Luo and Meihui Zhang",

note = "Publisher Copyright: {\textcopyright} 2025 IEEE.; 41st IEEE International Conference on Data Engineering, ICDE 2025 ; Conference date: 19-05-2025 Through 23-05-2025",

year = "2025",

doi = "10.1109/ICDE65448.2025.00176",

language = "English",

series = "Proceedings - International Conference on Data Engineering",

publisher = "IEEE Computer Society",

pages = "2323--2336",

booktitle = "Proceedings - 2025 IEEE 41st International Conference on Data Engineering, ICDE 2025",

address = "United States",

}

Wang, H, Shi, J, Chen, Y, Xu, H, Zhang, C, Luo, Z & Zhang, M 2025, PFCA: Efficient Path Filtering with Causal Analysis for Healthcare Risk Prediction. in Proceedings - 2025 IEEE 41st International Conference on Data Engineering, ICDE 2025. Proceedings - International Conference on Data Engineering, IEEE Computer Society, pp. 2323-2336, 41st IEEE International Conference on Data Engineering, ICDE 2025, Hong Kong, China, 19/05/25. http://doi.org/10.1109/ICDE65448.2025.00176

PFCA: Efficient Path Filtering with Causal Analysis for Healthcare Risk Prediction. / Wang, Hao; Shi, Jiyun; Chen, Yuhao et al.
Proceedings - 2025 IEEE 41st International Conference on Data Engineering, ICDE 2025. IEEE Computer Society, 2025. p. 2323-2336 (Proceedings - International Conference on Data Engineering).

TY - GEN

T1 - PFCA: Efficient Path Filtering with Causal Analysis for Healthcare Risk Prediction

T2 - 41st IEEE International Conference on Data Engineering, ICDE 2025

AU - Wang, Hao

AU - Shi, Jiyun

AU - Chen, Yuhao

AU - Xu, Haochen

AU - Zhang, Chi

AU - Luo, Zhaojing

AU - Zhang, Meihui

PY - 2025

Y1 - 2025

AB - Electronic health records (EHRs) store patient medical history in a structured format, which facilitates automatic healthcare risk prediction and thereby improves personalized healthcare management and treatment. There are two main categories of methods for automatic healthcare risk prediction. The first models time-series information or relationships between visits to obtain enhanced patient representations. However, given the high dimensionality of EHR data, it often yields compromised results due to the lack of training data. The second exploits external knowledge, e.g., knowledge graphs (KGs), to augment the training data, but less attention has been paid to distinguishing the importance of features and filtering out irrelevant external knowledge, leading to overwhelming noise and inefficiency. Additionally, the joint relationships between patient features, which are highlighted in clinical practice, have not been emphasized. In this paper, we propose an efficient Path Filtering with Causal Analysis (PFCA) approach for enhanced healthcare risk prediction to address these challenges. PFCA first extracts personalized knowledge graphs (PKGs) consisting of paths linking the patient's features to targets and then devises a fine-grained filtering method based on path messages to remove irrelevant paths for better efficiency. We then develop an effective similarity-based method to model different features' joint interactions with targets to learn augmented representations for each feature. Furthermore, we design a causal analysis method that includes a novel causal intervention mechanism to mine and prioritize causal features for improved predictive performance. Finally, by exploiting the attention weights of paths in the PKGs, PFCA provides target-oriented interpretations, showing how patients' features lead to targets through significant paths. Experimental results on three public real-world datasets and four healthcare risk prediction tasks confirm PFCA's effectiveness in improving predictive performance compared to ten state-of-the-art baselines and demonstrate the efficiency of its path filtering and its interpretability.

KW - causal analysis

KW - Efficient healthcare analytics

KW - Electronic health record

KW - interpretable analytics

KW - knowledge graphs

UR - http://www.scopus.com/pages/publications/105015532930

U2 - 10.1109/ICDE65448.2025.00176

DO - 10.1109/ICDE65448.2025.00176

M3 - Conference contribution

AN - SCOPUS:105015532930

T3 - Proceedings - International Conference on Data Engineering

SP - 2323

EP - 2336

BT - Proceedings - 2025 IEEE 41st International Conference on Data Engineering, ICDE 2025

PB - IEEE Computer Society

Y2 - 19 May 2025 through 23 May 2025

ER -
