TY - JOUR
T1 - Authenticated Subgraph Matching on Large-Scale Graphs in Hybrid-Storage Blockchains
AU - Li, Siyu
AU - Zhang, Zhiwei
AU - Zhong, Kai
AU - Zhao, Kangfei
AU - Zhang, Meihui
AU - Yuan, Ye
AU - Wang, Guoren
N1 - Publisher Copyright:
© 1989-2012 IEEE.
PY - 2025/11
Y1 - 2025/11
N2 - Graphs serve as an essential data structure to model complex relationships in a variety of applications, such as social networks, web graphs, and chemical informatics. Due to the high cost of maintaining large-scale graph data and executing graph queries, data owners often outsource their graph data to a third-party service provider for graph processing. In this scenario, it is crucial to ensure the integrity of query results, as the provider may have the incentive to return only partial or tampered results to save computing resources or serve their own interests. Blockchain, as a promising solution for secure data storage and retrieval, opens up new opportunities for data management in such scenarios. To scale the blockchain, existing studies have concentrated on using off-chain storage while ensuring the integrity of query results for key-value data in hybrid-storage blockchain architectures. To the best of our knowledge, there is no work to enable the blockchain to support subgraph matching queries. In this paper, we first study the problem of authenticated subgraph matching queries. Traditional subgraph matching algorithms follow the filtering-searching paradigm. The main challenge is to design an Authenticated Data Structure (ADS) and aggregation algorithm that efficiently aggregates non-results for verification during the filtering-searching process. We first propose a vertex-based scheme - the novel ADS MELTree can generate candidate vertices and aggregate non-resulting vertices in the filtering phase, while the aggregation algorithm AMatching can aggregate invalid partial results in the search phase. Furthermore, we propose the bidirectional search aggregation algorithm AMatching∗ and ADS MVPTree to reduce the computational cost in the search phase and to reduce the on-chain storage cost. In addition, we propose a novel path-based scheme to enhance the aggregation of non-results and accelerate the processing. We design the path-based ADS MPETree for generating candidate paths and aggregating non-resulting paths, and the aggregation algorithm PMatching for efficiently aggregating invalid partial results one path at a time. The results of extensive experiments on five real-world graphs demonstrate the efficiency of our proposed ADSs and aggregation algorithms.
AB - Graphs serve as an essential data structure to model complex relationships in a variety of applications, such as social networks, web graphs, and chemical informatics. Due to the high cost of maintaining large-scale graph data and executing graph queries, data owners often outsource their graph data to a third-party service provider for graph processing. In this scenario, it is crucial to ensure the integrity of query results, as the provider may have the incentive to return only partial or tampered results to save computing resources or serve their own interests. Blockchain, as a promising solution for secure data storage and retrieval, opens up new opportunities for data management in such scenarios. To scale the blockchain, existing studies have concentrated on using off-chain storage while ensuring the integrity of query results for key-value data in hybrid-storage blockchain architectures. To the best of our knowledge, there is no work to enable the blockchain to support subgraph matching queries. In this paper, we first study the problem of authenticated subgraph matching queries. Traditional subgraph matching algorithms follow the filtering-searching paradigm. The main challenge is to design an Authenticated Data Structure (ADS) and aggregation algorithm that efficiently aggregates non-results for verification during the filtering-searching process. We first propose a vertex-based scheme - the novel ADS MELTree can generate candidate vertices and aggregate non-resulting vertices in the filtering phase, while the aggregation algorithm AMatching can aggregate invalid partial results in the search phase. Furthermore, we propose the bidirectional search aggregation algorithm AMatching∗ and ADS MVPTree to reduce the computational cost in the search phase and to reduce the on-chain storage cost. In addition, we propose a novel path-based scheme to enhance the aggregation of non-results and accelerate the processing. We design the path-based ADS MPETree for generating candidate paths and aggregating non-resulting paths, and the aggregation algorithm PMatching for efficiently aggregating invalid partial results one path at a time. The results of extensive experiments on five real-world graphs demonstrate the efficiency of our proposed ADSs and aggregation algorithms.
KW - hybrid-storage blockchain
KW - query authentication
KW - Subgraph matching
UR - http://www.scopus.com/pages/publications/105013599467
U2 - 10.1109/TKDE.2025.3598850
DO - 10.1109/TKDE.2025.3598850
M3 - Article
AN - SCOPUS:105013599467
SN - 1041-4347
VL - 37
SP - 6286
EP - 6303
JO - IEEE Transactions on Knowledge and Data Engineering
JF - IEEE Transactions on Knowledge and Data Engineering
IS - 11
ER -