Prediction and Validation of Gene-Disease Associations Using Methods Inspired by Social Network Analyses

Cited by: 117
Authors
Singh-Blom, U. Martin [1 ,2 ]
Natarajan, Nagarajan [3 ]
Tewari, Ambuj [4 ]
Woods, John O. [1 ]
Dhillon, Inderjit S. [3 ]
Marcotte, Edward M. [1 ,5 ]
Affiliations
[1] Univ Texas Austin, Ctr Syst & Synthet Biol, Inst Cellular & Mol Biol, Austin, TX 78712 USA
[2] Karolinska Inst, Dept Med, Stockholm, Sweden
[3] Univ Texas Austin, Dept Comp Sci, Austin, TX 78712 USA
[4] Univ Michigan, Dept Stat, Ann Arbor, MI 48109 USA
[5] Univ Texas Austin, Dept Chem & Biochem, Austin, TX 78712 USA
Source
PLOS ONE | 2013 / Vol. 8 / Issue 05
Funding
National Science Foundation (USA); National Institutes of Health (USA);
Keywords
GENOME; DATABASE; PRIORITIZATION; IDENTIFICATION; INTEGRATION; PHENOTYPE; RESOURCE; BIOLOGY; WALKING; MODELS;
DOI
10.1371/journal.pone.0058977
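Chinese Library Classification number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code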
07 ; 0710 ; 09 ;
Abstract
Correctly identifying associations of genes with diseases has long been a goal in biology. With the emergence of large-scale gene-phenotype association datasets, we can leverage statistical and machine learning methods to help achieve this goal. In this paper, we present two methods for predicting gene-disease associations based on functional gene associations and gene-phenotype associations in model organisms. The first method, the Katz measure, is motivated by its success in social network link prediction and is closely related to some recently proposed methods for gene-disease association inference. The second method, CATAPULT (Combining dATa Across species using Positive-Unlabeled Learning Techniques), is a supervised machine learning method that uses a biased support vector machine whose features are derived from walks in a heterogeneous gene-trait network. We study the performance of the proposed methods and related state-of-the-art methods using two different evaluation strategies on two distinct data sets, namely OMIM phenotypes and drug-target interactions. We show that although both methods perform very well, the Katz measure is better at identifying associations between traits and poorly studied genes, whereas CATAPULT is better suited to correctly identifying gene-trait associations overall.
Acknowledgments: The authors thank Jon Laurent and Kris McGary for some of the data used, and Li and Patra for making their code available. Most of Ambuj Tewari's contribution to this work happened while he was a postdoctoral fellow at the University of Texas at Austin.
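As a rough illustration of the Katz measure named in the abstract, the sketch below scores node pairs in a small graph by summing walks of increasing length, each damped by a factor beta per step. It is not the authors' implementation: the toy graph, the function names `mat_mul` and `katz_scores`, and the values of `beta` and `max_len` are all assumptions chosen for illustration, and the paper's heterogeneous gene-trait network is replaced here by a single four-node path.

```python
def mat_mul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def katz_scores(adj, beta=0.1, max_len=3):
    """Truncated Katz measure: sum over l = 1..max_len of beta**l * A**l.

    Entry (i, j) of A**l counts walks of length l between nodes i and j,
    so shorter connections contribute more to the score than longer ones.
    """
    n = len(adj)
    scores = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in adj]  # holds A**l, starting at l = 1
    for length in range(1, max_len + 1):
        weight = beta ** length
        for i in range(n):
            for j in range(n):
                scores[i][j] += weight * power[i][j]
        power = mat_mul(power, adj)
    return scores


# Hypothetical toy graph: nodes 0-2 stand for genes, node 3 for a disease.
# Edges form a path: gene0 - gene1 - gene2 - disease3.
adj = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

scores = katz_scores(adj, beta=0.1, max_len=3)

# Rank the genes by their Katz score with respect to the disease node.
ranking = sorted(range(3), key=lambda g: scores[g][3], reverse=True)
print(ranking)
```

In this toy graph, the gene directly linked to the disease ranks first and the most distant gene ranks last, which is the link-prediction intuition the abstract refers to.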
Pages: 17
Related papers
50 in total
  • [31] Disease Gene Prioritization Using Network and Feature
    Xie, Bingqing
    Agam, Gady
    Balasubramanian, Sandhya
    Xu, Jinbo
    Gilliam, T. Conrad
    Maltsev, Natalia
    Boernigen, Daniela
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2015, 22 (04) : 313 - 323
  • [32] Network-based methods for gene function prediction
    Chen, Qingfeng
    Li, Yongjie
    Tan, Kai
    Qiao, Yvlu
    Pan, Shirui
    Jiang, Taijiao
    Chen, Yi-Ping Phoebe
    BRIEFINGS IN FUNCTIONAL GENOMICS, 2021, 20 (04) : 249 - 257
  • [33] Clinical exome sequencing for cerebellar ataxia and spastic paraplegia uncovers novel gene-disease associations and unanticipated rare disorders
    van de Warrenburg, Bart P.
    Schouten, Meyke I.
    de Bot, Susanne T.
    Vermeer, Sascha
    Meijer, Rowdy
    Pennings, Maartje
    Gilissen, Christian
    Willemsen, Michel A. A. P.
    Scheffer, Hans
    Kamsteeg, Erik-Jan
    EUROPEAN JOURNAL OF HUMAN GENETICS, 2016, 24 (10) : 1460 - 1466
  • [34] NELDA: Prediction of LncRNA-disease Associations With Network Embedding
    Li, Wei-Na
    Fan, Xiao-Nan
    Zhang, Shao-Wu
    PROGRESS IN BIOCHEMISTRY AND BIOPHYSICS, 2022, 49 (07) : 1369 - 1380
  • [35] A clinical laboratory's experience using GeneMatcher-Building stronger gene-disease relationships
    Taylor, Julie P.
    Malhotra, Alka
    Burns, Nicole J.
    Clause, Amanda R.
    Brown, Carolyn M.
    Burns, Brendan T.
    Chandrasekhar, Anjana
    Schlachetzki, Zinayida
    Bennett, Maren
    Thorpe, Erin
    Taft, Ryan J.
    Perry, Denise L.
    Coffey, Alison J.
    HUMAN MUTATION, 2022, 43 (06) : 765 - 771
  • [36] Protein Function Prediction Using Function Associations in Protein-Protein Interaction Network
    Sun, Pingping
    Tan, Xian
    Guo, Sijia
    Zhang, Jingbo
    Sun, Bojian
    Du, Ning
    Wang, Han
    Sun, Hui
    IEEE ACCESS, 2018, 6 : 30892 - 30902
  • [37] GANLDA: Graph attention network for lncRNA-disease associations prediction
    Lan, Wei
    Wu, Ximin
    Chen, Qingfeng
    Peng, Wei
    Wang, Jianxin
    Chen, Yiping Phoebe
    NEUROCOMPUTING, 2022, 469 : 384 - 393
  • [38] Prediction of LncRNA-Disease Associations Based on Network Consistency Projection
    Li, Guanghui
    Luo, Jiawei
    Liang, Cheng
    Xiao, Qiu
    Ding, Pingjian
    Zhang, Yuejin
    IEEE ACCESS, 2019, 7 : 58849 - 58856
  • [39] Enhancing the prediction of disease-gene associations with multimodal deep learning
    Luo, Ping
    Li, Yuanyuan
    Tian, Li-Ping
    Wu, Fang-Xiang
    BIOINFORMATICS, 2019, 35 (19) : 3735 - 3742
  • [40] Network-Based Approaches for Disease-Gene Association Prediction Using Protein-Protein Interaction Networks
    Kim, Yoonbee
    Park, Jong-Hoon
    Cho, Young-Rae
    INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, 2022, 23 (13)