HetFS: a method for fast similarity search with ad-hoc meta-paths on heterogeneous information networks

Cited by: 1
Authors
Mao, Xuqi [1, 3]
Chen, Zhenyi [2, 3]
He, Zhenying [1, 3]
Jing, Yinan [1, 3]
Zhang, Kai [1, 3]
Wang, X. Sean [1, 2, 3]
Affiliations
[1] Fudan Univ, Sch Comp Sci, Shanghai, Peoples R China
[2] Fudan Univ, Software Sch, Shanghai, Peoples R China
[3] Shanghai Key Lab Data Sci, Shanghai, Peoples R China
Source
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2024 / Vol. 27 / No. 6
Funding
National Natural Science Foundation of China;
Keywords
Heterogeneous information network; Similarity search method; User-given meta-path; Ad-hoc query;
DOI
10.1007/s11280-024-01303-1
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Numerous real-world information networks form Heterogeneous Information Networks (HINs) with diverse objects and relations represented as nodes and edges in heterogeneous graphs. Similarity between nodes quantifies how closely two nodes resemble each other, mainly depending on the similarity of the nodes they are connected to, recursively. Users may be interested in only specific types of connections in the similarity definition, represented as meta-paths, i.e., a sequence of node and edge types. Existing Heterogeneous Graph Neural Network (HGNN)-based similarity search methods may accommodate meta-paths, but require retraining for different meta-paths. Conversely, existing path-based similarity search methods may switch flexibly between meta-paths but often suffer from lower accuracy, as they rely solely on path information. This paper proposes HetFS, a Fast Similarity method for ad-hoc queries with user-given meta-paths on Heterogeneous information networks. HetFS provides similarity results based on path information that satisfies the meta-path restriction, as well as node content. Extensive experiments demonstrate the effectiveness and efficiency of HetFS in addressing ad-hoc queries, outperforming state-of-the-art HGNNs and path-based approaches, and showing strong performance in downstream applications, including link prediction, node classification, and clustering.
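To make the notion of a meta-path restriction concrete, the following is a minimal sketch, not the HetFS method itself. It counts path instances that follow a user-given meta-path and combines the counts into a PathSim-style normalized score, which is a classic path-based measure swapped in here purely for illustration. The toy graph, the names `NODE_TYPE`, `EDGES`, `count_paths`, and `pathsim` are all hypothetical, and for brevity the meta-path is given as a sequence of node types only, with edge types omitted.

```python
from collections import defaultdict

# Hypothetical toy HIN: each node has a type; edges are undirected.
NODE_TYPE = {
    "a1": "Author", "a2": "Author", "a3": "Author",
    "p1": "Paper", "p2": "Paper", "p3": "Paper",
}
EDGES = [("a1", "p1"), ("a1", "p2"), ("a2", "p2"), ("a2", "p3"), ("a3", "p3")]

ADJ = defaultdict(list)
for u, v in EDGES:
    ADJ[u].append(v)
    ADJ[v].append(u)


def count_paths(src, dst, meta_path):
    """Count path instances from src to dst whose node types follow meta_path."""
    if NODE_TYPE[src] != meta_path[0]:
        return 0
    # frontier maps a node to the number of partial path instances ending there.
    frontier = {src: 1}
    for wanted_type in meta_path[1:]:
        nxt = defaultdict(int)
        for node, n in frontier.items():
            for nb in ADJ[node]:
                if NODE_TYPE[nb] == wanted_type:
                    nxt[nb] += n
        frontier = nxt
    return frontier.get(dst, 0)


def pathsim(x, y, meta_path):
    """PathSim-style score for a symmetric meta-path (illustrative only)."""
    denom = count_paths(x, x, meta_path) + count_paths(y, y, meta_path)
    return 2 * count_paths(x, y, meta_path) / denom if denom else 0.0


APA = ["Author", "Paper", "Author"]  # ad-hoc meta-path chosen at query time
print(pathsim("a1", "a2", APA))
print(pathsim("a1", "a3", APA))
```

Because the meta-path is an ordinary argument, switching to a different one at query time requires no retraining, which is the flexibility the abstract attributes to path-based methods. This sketch uses path information only and ignores node content.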
Pages: 24