Prediction and Validation of Gene-Disease Associations Using Methods Inspired by Social Network Analyses

Cited by: 117
Authors
Singh-Blom, U. Martin [1, 2]
Natarajan, Nagarajan [3 ]
Tewari, Ambuj [4 ]
Woods, John O. [1 ]
Dhillon, Inderjit S. [3 ]
Marcotte, Edward M. [1, 5]
Affiliations
[1] Univ Texas Austin, Ctr Syst & Synthet Biol, Inst Cellular & Mol Biol, Austin, TX 78712 USA
[2] Karolinska Inst, Dept Med, Stockholm, Sweden
[3] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
[4] Univ Michigan, Dept Stat, Ann Arbor, MI 48109 USA
[5] Univ Texas Austin, Dept Chem & Biochem, Austin, TX 78712 USA
Source
PLOS ONE | 2013 / Vol. 8 / Issue 05
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
GENOME; DATABASE; PRIORITIZATION; IDENTIFICATION; INTEGRATION; PHENOTYPE; RESOURCE; BIOLOGY; WALKING; MODELS;
DOI
10.1371/journal.pone.0058977
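Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code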
07 ; 0710 ; 09 ;
Abstract
Correctly identifying associations of genes with diseases has long been a goal in biology. With the emergence of large-scale gene-phenotype association datasets, we can leverage statistical and machine learning methods to help achieve this goal. In this paper, we present two methods for predicting gene-disease associations based on functional gene associations and gene-phenotype associations in model organisms. The first method, the Katz measure, is motivated by its success in social network link prediction and is closely related to some recently proposed methods for gene-disease association inference. The second method, called CATAPULT (Combining dATa Across species using Positive-Unlabeled Learning Techniques), is a supervised machine learning method that uses a biased support vector machine whose features are derived from walks in a heterogeneous gene-trait network. We study the performance of the proposed methods and related state-of-the-art methods using two different evaluation strategies on two distinct data sets, namely OMIM phenotypes and drug-target interactions. Comparing results across the two evaluation strategies, we show that although both methods perform very well, the Katz measure is better at identifying associations between traits and poorly studied genes, whereas CATAPULT is better suited to correctly identifying gene-trait associations overall.
The authors want to thank Jon Laurent and Kris McGary for some of the data used, and Li and Patra for making their code available. Most of Ambuj Tewari's contribution to this work happened while he was a postdoctoral fellow at the University of Texas at Austin.
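As a rough illustration of the first method, the standard Katz measure scores a pair of nodes by summing the number of walks between them over all walk lengths, with a walk of length l weighted by beta^l so that longer walks count less. The sketch below truncates that sum at a fixed length and runs it on a made-up three-node network. The function names, the damping value `beta=0.1`, the cutoff `max_len=3`, and the toy network are all assumptions chosen for illustration; this is not the authors' implementation, and it ignores the heterogeneous edge types the paper combines.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of lists."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def truncated_katz(adj, beta=0.1, max_len=3):
    """Truncated Katz scores: sum over l = 1..max_len of beta**l * (adj**l).

    adj is a square adjacency matrix; beta and max_len are illustrative
    defaults, not values taken from the paper.
    """
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adj]  # adj**1: counts of walks of length 1
    weight = beta
    for _ in range(max_len):
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
        power = mat_mul(power, adj)  # walk counts for the next length
        weight *= beta
    return scores


# Hypothetical toy network: node 0 = gene g1, node 1 = gene g2, node 2 = disease d.
# g1 and g2 are functionally linked; g2 has a known association with d.
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
scores = truncated_katz(adjacency)
print("Katz score for (g1, d):", scores[0][2])
print("Katz score for (g2, d):", scores[1][2])
```

In this toy network g1 has no direct edge to d, yet it still receives a nonzero score through the length-2 walk via g2. This is the sense in which walk-based scores can propose candidate associations for genes with no known disease links.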
Pages: 17
Related papers
50 in total
  • [21] Improving the identification of miRNA-disease associations with multi-task learning on gene-disease networks
    He, Qiang
    Qiao, Wei
    Fang, Hui
    Bao, Yang
    BRIEFINGS IN BIOINFORMATICS, 2023, 24 (04)
  • [22] NIDM: network impulsive dynamics on multiplex biological network for disease-gene prediction
    Xiang, Ju
    Zhang, Jiashuai
    Zheng, Ruiqing
    Li, Xingyi
    Li, Min
    BRIEFINGS IN BIOINFORMATICS, 2021, 22 (05)
  • [23] Recent advances in network-based methods for disease gene prediction
    Ata, Sezin Kircali
    Min Wu
    Yuan Fang
    Le Ou-Yang
    Kwoh, Chee Keong
    Li, Xiao-Li
    BRIEFINGS IN BIOINFORMATICS, 2021, 22 (04)
  • [24] Synthesis of genetic association studies for pertinent gene-disease associations requires appropriate methodological and statistical approaches
    Zintzaras, Elias
    Lau, Joseph
    JOURNAL OF CLINICAL EPIDEMIOLOGY, 2008, 61 (07) : 634 - 645
  • [25] Gene gravity-like algorithm for disease gene prediction based on phenotype-specific network
    Lin, Limei
    Yang, Tinghong
    Fang, Ling
    Yang, Jian
    Yang, Fan
    Zhao, Jing
    BMC SYSTEMS BIOLOGY, 2017, 11 : 121
  • [26] NTSMDA: prediction of miRNA-disease associations by integrating network topological similarity
    Sun, Dongdong
    Li, Ao
    Feng, Huanqing
    Wang, Minghui
    MOLECULAR BIOSYSTEMS, 2016, 12 (07) : 2224 - 2232
  • [27] Prediction of disease genes using tissue-specified gene-gene network
    Ganegoda, Gamage Upeksha
    Wang, JianXin
    Wu, Fang-Xiang
    Li, Min
    BMC SYSTEMS BIOLOGY, 2014, 8
  • [28] HerGePred: Heterogeneous Network Embedding Representation for Disease Gene Prediction
    Yang, Kuo
    Wang, Ruyu
    Liu, Guangming
    Shu, Zixin
    Wang, Ning
    Zhang, Runshun
    Yu, Jian
    Chen, Jianxin
    Li, Xiaodong
    Zhou, Xuezhong
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2019, 23 (04) : 1805 - 1815
  • [29] A Novel Disease Gene Prediction Method Based on PPI Network
    Zhao, Junmin
    He, Tingting
    Hu, Xiaohua
    Wang, Yan
    Shen, Xianjun
    Fang, Minghong
    Yuan, Jie
    2014 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2014,
  • [30] Gene Network Analysis of Alzheimer's Disease Based on Network and Statistical Methods
    Zhou, Chen
    Guo, Haiyan
    Cao, Shujuan
    ENTROPY, 2021, 23 (10)