WAFNRLTG: A Novel Model for Predicting LncRNA Target Genes Based on Weighted Average Fusion Network Representation Learning Method

Cited by: 2
Authors
Li, Jianwei [1 ,2 ]
Yang, Zhenwu [1 ]
Wang, Duanyang [1 ]
Li, Zhiguang [1 ]
Affiliations
[1] Hebei Univ Technol, Inst Computat Med, Sch Artificial Intelligence, Tianjin, Peoples R China
[2] Hebei Univ Technol, Hebei Prov Key Lab Big Data Calculat, Tianjin, Peoples R China
Source
FRONTIERS IN CELL AND DEVELOPMENTAL BIOLOGY | 2022 / Vol. 9
Funding
National Natural Science Foundation of China;
Keywords
lncRNA target genes prediction; weighted average fusion network representation learning; heterogeneous network; machine learning; XGBoost; LONG NONCODING RNAS; DISEASE; EXPRESSION; DATABASE; CERNA;
DOI
10.3389/fcell.2021.820342
Chinese Library Classification
Q2 [Cell Biology];
Discipline classification codes
071009 ; 090102 ;
Abstract
Long non-coding RNAs (lncRNAs) do not encode proteins, yet they are well established as participants in complex regulatory functions, and lncRNA regulatory dysfunction can lead to a variety of complex human diseases. LncRNAs mostly exert their functions by regulating the expression of target genes, so accurate prediction of potential lncRNA target genes would help to further the functional annotation of lncRNAs. Considering the limitations of traditional computational methods for predicting lncRNA target genes, we propose a novel model named Weighted Average Fusion Network Representation learning for predicting LncRNA Target Genes (WAFNRLTG). First, a novel heterogeneous network was constructed by integrating an lncRNA sequence similarity network, an mRNA sequence similarity network, an lncRNA-mRNA interaction network, an lncRNA-miRNA interaction network and an mRNA-miRNA interaction network. Next, four popular network representation learning methods were used to learn representation vectors for the lncRNA and mRNA nodes. Then, the representations of lncRNAs and target genes in the heterogeneous network were obtained with the weighted average fusion network representation learning method. Finally, we merged the representations of lncRNAs and related target genes to form lncRNA-gene pairs, trained an XGBoost classifier and predicted potential lncRNA target genes. In five-fold cross-validation on the training and independent datasets, WAFNRLTG achieved AUC scores of 0.9410 and 0.9350 and AUPR scores of 0.9391 and 0.9350, respectively. Moreover, case studies predicting the potential target genes of three common lncRNAs confirmed the effectiveness of WAFNRLTG. The source code and all data of WAFNRLTG can be freely downloaded at https://github.com/HGDYZW/WAFNRLTG.
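The fusion and pairing steps described in the abstract can be illustrated with a minimal sketch. The abstract does not state which four representation learning methods were used, how the fusion weights were chosen, or how the pair vectors were merged, so the function names (`weighted_average_fusion`, `pair_features`), the toy embeddings, the weights and the use of concatenation are all assumptions made here for illustration and are not taken from the WAFNRLTG source code.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def weighted_average_fusion(
    embeddings: Sequence[Dict[str, Vector]],
    weights: Sequence[float],
) -> Dict[str, Vector]:
    """Fuse per-method node embeddings into one vector per node.

    Each element of `embeddings` maps node name -> vector as produced by one
    (hypothetical) representation learning method; all methods are assumed
    to cover the same nodes with the same dimensionality.
    """
    if len(embeddings) != len(weights):
        raise ValueError("need exactly one weight per embedding method")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    fused: Dict[str, Vector] = {}
    for node, first_vec in embeddings[0].items():
        dim = len(first_vec)
        # Weighted average of the node's vectors, coordinate by coordinate.
        fused[node] = [
            sum(w * emb[node][i] for emb, w in zip(embeddings, weights)) / total
            for i in range(dim)
        ]
    return fused


def pair_features(fused: Dict[str, Vector], lncrna: str, gene: str) -> Vector:
    """Build an lncRNA-gene pair feature by concatenation (an assumption)."""
    return fused[lncrna] + fused[gene]


# Toy example with two methods instead of four, and made-up weights.
method_a = {"lnc1": [1.0, 0.0], "geneA": [0.0, 2.0]}
method_b = {"lnc1": [3.0, 4.0], "geneA": [2.0, 0.0]}

fused = weighted_average_fusion([method_a, method_b], weights=[0.25, 0.75])
features = pair_features(fused, "lnc1", "geneA")
print(features)
```

Vectors built this way for known and unknown lncRNA-gene pairs would then serve as the inputs to a binary classifier such as XGBoost; that training step is omitted from the sketch.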
Pages: 11