Co-clustering phenome-genome for phenotype classification and disease gene discovery

Cited by: 53
Authors
Hwang, TaeHyun [2]
Atluri, Gowtham [1]
Xie, MaoQiang [3]
Dey, Sanjoy [1]
Hong, Changjin [4]
Kumar, Vipin [1]
Kuang, Rui [1]
Affiliations
[1] Univ Minnesota Twin Cities, Dept Comp Sci & Engn, Minneapolis, MN 55455 USA
[2] Univ Minnesota Twin Cities, Masonic Med Ctr, Bioinformat Core, Minneapolis, MN 55455 USA
[3] Nankai Univ, Coll Software, Tianjin 300071, Peoples R China
[4] Boston Univ, Computat Biomed Div, Dept Med, Boston, MA 02118 USA
Funding
National Science Foundation (USA);
Keywords
COLORECTAL-CANCER; EXPRESSION PROFILES; ALZHEIMERS-DISEASE; EXO1; GENE; MUTATIONS; DIAGNOSIS; RESOURCE; PATHWAYS; FEATURES; NETWORK;
DOI
10.1093/nar/gks615
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Understanding the categorization of human diseases is critical for reliably identifying disease causal genes. Recently, genome-wide studies of abnormal chromosomal locations related to diseases have mapped more than 2000 phenotype-gene relations, which provide valuable information for classifying diseases and identifying candidate genes as drug targets. In this article, a regularized non-negative matrix tri-factorization (R-NMTF) algorithm is introduced to co-cluster phenotypes and genes and simultaneously detect associations between the detected phenotype clusters and gene clusters. The R-NMTF algorithm factorizes the phenotype-gene association matrix using prior knowledge from a phenotype similarity network and a protein-protein interaction network, supervised by label information from known disease classes and biological pathways. In experiments on disease phenotype-gene associations in OMIM and KEGG disease pathways, R-NMTF significantly improved the classification of disease phenotypes and disease pathway genes compared with support vector machines and label propagation in cross-validation on the annotated phenotypes and genes. The newly predicted phenotypes in each disease class are highly consistent with Human Phenotype Ontology annotations. The roles of the new member genes in the disease pathways are examined and validated in the protein-protein interaction subnetworks. An extensive literature review also confirmed many new members of the disease classes and pathways, as well as the predicted associations between disease phenotype classes and pathways.
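To make the co-clustering idea in the abstract concrete, the following is a minimal sketch of a graph-regularized non-negative matrix tri-factorization in Python with NumPy. It is not the authors' R-NMTF implementation: the objective, the multiplicative update rules, the function names, the parameters `alpha` and `beta`, and the toy data are all assumptions chosen for illustration, and the supervised term from known disease classes and pathways is omitted. The sketch approximates a phenotype-gene matrix X by F S G^T, where F assigns phenotypes to clusters, G assigns genes to clusters, and S scores associations between phenotype clusters and gene clusters, while two similarity networks encourage connected items to share clusters.

```python
import numpy as np


def objective(X, F, S, G, A_row, A_col, alpha, beta):
    """Reconstruction error plus graph-Laplacian smoothness penalties."""
    L_row = np.diag(A_row.sum(axis=1)) - A_row
    L_col = np.diag(A_col.sum(axis=1)) - A_col
    fit = np.linalg.norm(X - F @ S @ G.T) ** 2
    return (fit
            + alpha * np.trace(F.T @ L_row @ F)
            + beta * np.trace(G.T @ L_col @ G))


def graph_regularized_nmtf(X, A_row, A_col, k_row, k_col,
                           alpha=0.1, beta=0.1, n_iter=200,
                           eps=1e-9, seed=0):
    """Illustrative tri-factorization X ~ F S G^T with non-negative factors.

    A_row and A_col are assumed to be symmetric, non-negative adjacency
    matrices (for example a phenotype similarity network and a
    protein-protein interaction network).
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    F = rng.random((n, k_row)) + eps
    S = rng.random((k_row, k_col)) + eps
    G = rng.random((m, k_col)) + eps
    D_row = np.diag(A_row.sum(axis=1))
    D_col = np.diag(A_col.sum(axis=1))

    history = [objective(X, F, S, G, A_row, A_col, alpha, beta)]
    for _ in range(n_iter):
        # Multiplicative updates keep every factor non-negative.
        F *= (X @ G @ S.T + alpha * A_row @ F) / (
            F @ (S @ G.T @ G @ S.T) + alpha * D_row @ F + eps)
        G *= (X.T @ F @ S + beta * A_col @ G) / (
            G @ (S.T @ F.T @ F @ S) + beta * D_col @ G + eps)
        S *= (F.T @ X @ G) / (F.T @ F @ S @ G.T @ G + eps)
        history.append(objective(X, F, S, G, A_row, A_col, alpha, beta))
    return F, S, G, history


# Hypothetical toy data: 6 phenotypes x 8 genes with two planted blocks.
X = np.zeros((6, 8))
X[:3, :4] = 1.0
X[3:, 4:] = 1.0

# Toy similarity networks that agree with the planted blocks.
A_row = np.zeros((6, 6))
A_row[:3, :3] = 1.0
A_row[3:, 3:] = 1.0
np.fill_diagonal(A_row, 0.0)

A_col = np.zeros((8, 8))
A_col[:4, :4] = 1.0
A_col[4:, 4:] = 1.0
np.fill_diagonal(A_col, 0.0)

F, S, G, history = graph_regularized_nmtf(X, A_row, A_col, k_row=2, k_col=2)

print("phenotype cluster assignments:", F.argmax(axis=1))
print("gene cluster assignments:", G.argmax(axis=1))
print("cluster association matrix S:")
print(S)
```

In this sketch, each phenotype or gene is assigned to the cluster with the largest entry in its row of F or G, and large entries of S indicate which phenotype cluster is associated with which gene cluster. A supervised variant would add terms that pull labeled rows of F and G toward their known classes, which is left out here.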
Pages: 16
Related papers
5 in total
  • [1] Co-clustering of Diseases, Genes, and Drugs for Identification of Their Related Gene Modules
    Koohi, Arezou
    Homayoun, Houman
    Xu, Jie
    Orooji, Mahdi
    2016 EIGHTH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTATIONAL INTELLIGENCE (ICACI), 2016, : 407 - 411
  • [2] Phenome-based gene discovery provides information about Parkinson's disease drug targets
    Chen, Yang
    Xu, Rong
    BMC GENOMICS, 2016, 17
  • [3] ConGEMs: Condensed Gene Co-Expression Module Discovery Through Rule-Based Clustering and Its Application to Carcinogenesis
    Mallik, Saurav
    Zhao, Zhongming
    GENES, 2018, 9 (01):
  • [4] Genome-wide analysis of cis-regulatory element structure and discovery of motif-driven gene co-expression networks in grapevine
    Wong, Darren Chern Jan
    Gutierrez, Rodrigo Lopez
    Gambetta, Gregory Alan
    Castellarin, Simone Diego
    DNA RESEARCH, 2017, 24 (03) : 311 - 326
  • [5] Gene co-expression in the interactome: moving from correlation toward causation via an integrated approach to disease module discovery
    Paci, Paola
    Fiscon, Giulia
    Conte, Federica
    Wang, Rui-Sheng
    Farina, Lorenzo
    Loscalzo, Joseph
    NPJ SYSTEMS BIOLOGY AND APPLICATIONS, 2021, 7 (01)