Predicting lncRNA-disease associations using multiple metapaths in hierarchical graph attention networks

Cited by: 2
Authors
Yao, Dengju [1]
Deng, Yuexiao [1]
Zhan, Xiaojuan [1,2]
Zhan, Xiaorong [3]
Affiliations
[1] Harbin Univ Sci & Technol, Sch Comp Sci & Technol, Harbin 150080, Peoples R China
[2] Heilongjiang Inst Technol, Coll Comp Sci & Technol, Harbin 150050, Peoples R China
[3] Univ Sci & Technol, Hosp South, Dept Endocrinol & Metab, Shenzhen 518055, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Metapaths; Heterogeneous graph; Multihead attention mechanism; lncRNA-disease associations; SIMILARITY; DATABASE;
DOI
10.1186/s12859-024-05672-2
Chinese Library Classification number
Q5 [Biochemistry];
Subject classification codes
071010 ; 081704 ;
Abstract
Background: Many biological studies have shown that lncRNAs regulate the expression of epigenetically related genes. The study of lncRNAs has helped to deepen our understanding of the pathogenesis of complex diseases at the molecular level. Because lncRNAs are numerous and biological experiments are complex and time-consuming, computational prediction of potential lncRNA-disease associations is very effective. To explore information in complex network structures, existing methods rely mainly on lncRNA and disease information. Metapaths have been applied to network models as an effective way to explore information in heterogeneous graphs. However, existing methods are dominated by lncRNA or disease nodes and tend to ignore the paths provided by intermediate nodes.

Methods: We propose a deep learning model based on hierarchical graph attention networks that predicts unknown lncRNA-disease associations by using multiple types of metapaths to extract features. We name this model MMHGAN. First, the model constructs a lncRNA-disease-miRNA heterogeneous graph based on known associations, along with two homogeneous graphs of lncRNAs and diseases. Second, for the homogeneous graphs, the features of neighboring nodes are aggregated using a multihead attention mechanism. Third, for the heterogeneous graph, metapaths with different intermediate nodes are selected to construct subgraphs, and the importance of the different metapath types is calculated and used to aggregate them into the final embedded features. Finally, the features are reconstructed using a fully connected layer to obtain the prediction results.

Results: Using fivefold cross-validation, we obtained an average AUC of 96.07% and an average AUPR of 93.23%. Ablation experiments demonstrated the role of the homogeneous graphs and of the path weights for different intermediate nodes. We also conducted case studies on lung cancer, esophageal carcinoma, and breast cancer. Among the 15 lncRNAs associated with each of these diseases, 15, 12, and 14, respectively, were validated by the LncRNADisease database and the Lnc2Cancer database.

Conclusion: We compared the MMHGAN model with six well-performing existing models, and the case study demonstrated that the model is effective in predicting potential lncRNA-disease associations.
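The metapath step described in the Methods can be illustrated with a small sketch. This is not the authors' implementation: the toy matrices, the two metapath types chosen (one through a miRNA intermediate node, one through a lncRNA intermediate node), the fixed importance scores, and all function and variable names are assumptions made for illustration. In a trained model, the importance scores would be learned by an attention layer rather than set by hand.

```python
import math

def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy data (hypothetical): 2 lncRNAs, 2 miRNAs, 3 diseases.
lnc_mi = [[1, 0],        # lnc_mi[i][k] = 1 if lncRNA i is linked to miRNA k
          [1, 1]]
mi_dis = [[1, 0, 0],     # mi_dis[k][j] = 1 if miRNA k is linked to disease j
          [0, 1, 1]]
lnc_lnc = [[0, 1],       # lnc_lnc[i][h] = 1 if lncRNAs i and h are neighbors
           [1, 0]]       # in the homogeneous lncRNA graph
lnc_dis = [[1, 0, 0],    # known lncRNA-disease associations
           [0, 0, 1]]

# Metapath lncRNA-miRNA-disease: path counts through a miRNA intermediate node.
path_lmd = matmul(lnc_mi, mi_dis)

# Metapath lncRNA-lncRNA-disease: path counts through a lncRNA intermediate node.
path_lld = matmul(lnc_lnc, lnc_dis)

# Assumed importance scores, one per metapath type, normalized to weights.
weights = softmax([0.5, 1.5])

# Weighted fusion of the metapath-specific evidence for each lncRNA-disease pair.
fused = [[weights[0] * a + weights[1] * b for a, b in zip(row_a, row_b)]
         for row_a, row_b in zip(path_lmd, path_lld)]
```

Each entry of `fused` combines the evidence from both metapath types for one lncRNA-disease pair, so a pair with no direct known association can still receive a nonzero score through an intermediate node.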
Pages: 23