A clustering-based sampling method for miRNA-disease association prediction

Cited by: 2
Authors
Wei, Zheng [1]
Yao, Dengju [1]
Zhan, Xiaojuan [1,2]
Zhang, Shuli [1]
Affiliations
[1] Harbin Univ Sci & Technol, Sch Comp Sci & Technol, Harbin, Peoples R China
[2] Heilongjiang Inst Technol, Coll Comp Sci & Technol, Harbin, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
miRNA-disease association; ensemble learning; clustering; sampling; computational methods; MICRORNAS;
DOI
10.3389/fgene.2022.995535
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
A growing number of studies have shown that microRNAs (miRNAs) play a critical role in regulating gene expression, and that abnormal miRNA expression tends to be associated with a variety of complex human diseases. Because identifying disease-associated miRNAs through biological experiments is costly and inefficient, researchers have turned to computational methods for predicting potential disease-associated miRNAs. Existing methods are flawed in how they construct the negative sample set, so we proposed a clustering-based sampling method for miRNA-disease association prediction (CSMDA). First, we integrated multiple kinds of miRNA and disease similarity information to represent miRNA-disease pairs. Second, we applied a clustering-based sampling method to avoid introducing potential positive samples when constructing the negative sample set. Third, we employed a random forest-based feature selection method to reduce noise and redundant information in the high-dimensional feature space. Finally, we implemented an ensemble learning framework that predicts miRNA-disease associations by soft voting. Under five-fold cross-validation, CSMDA achieved a Precision, Recall, F1-score, AUROC, and AUPR of 0.9676, 0.9545, 0.9610, 0.9928, and 0.9940, respectively. In addition, case studies on three cancers showed that the top 20 potentially associated miRNAs predicted by CSMDA were confirmed by the dbDEMC database or the literature. These results demonstrate that CSMDA can predict potential disease-associated miRNAs more accurately.
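The second and final steps of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the clustering algorithm, the number of clusters, or how many samples are drawn per cluster, so the use of k-means, the equal per-cluster quota, and every function and parameter name below (`kmeans`, `cluster_based_negative_sample`, `soft_vote`, `k`, `n_negatives`) are assumptions made for illustration. Unlabeled miRNA-disease pairs are represented as plain feature vectors.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (an assumed choice); returns a cluster index per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def cluster_based_negative_sample(unlabeled, n_negatives, k=3, seed=0):
    """Pick negative candidates by drawing an equal quota from each cluster.

    Returns sorted indices into `unlabeled`. Fewer than `n_negatives`
    indices are returned when a cluster is smaller than its quota.
    """
    labels = kmeans(unlabeled, k, seed=seed)
    rng = random.Random(seed)
    quota = n_negatives // k
    chosen = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        chosen.extend(rng.sample(members, min(quota, len(members))))
    return sorted(chosen)


def soft_vote(probabilities):
    """Soft voting: average the positive-class probability per sample.

    `probabilities` holds one list of per-sample probabilities per classifier.
    """
    return [sum(p) / len(p) for p in zip(*probabilities)]


if __name__ == "__main__":
    # Toy feature vectors standing in for unlabeled miRNA-disease pairs.
    toy_pairs = [[0.0, 0.1], [0.2, 0.0], [4.0, 4.1], [4.2, 3.9], [9.0, 0.1], [9.1, 0.0]]
    print(cluster_based_negative_sample(toy_pairs, n_negatives=3, k=3))
    print(soft_vote([[0.9, 0.2], [0.7, 0.4]]))
```

Drawing from every cluster spreads the negative set across the whole unlabeled space rather than concentrating it in one region, which is one plausible reading of how clustering helps; how the method actually screens out potential positives is described in the full paper, not in this abstract.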
Pages: 12