Fusion of Large-Scale Genomic Knowledge and Frequency Data Computationally Prioritizes Variants in Epilepsy

Cited by: 20
Authors
Campbell, Ian M. [1]
Rao, Mitchell [1]
Arredondo, Sean D. [2]
Lalani, Seema R. [1,3]
Xia, Zhilian [1]
Kang, Sung-Hae L. [1]
Bi, Weimin [1]
Breman, Amy M. [1]
Smith, Janice L. [1]
Bacino, Carlos A. [1,3]
Beaudet, Arthur L. [1,3,4]
Patel, Ankita [1]
Cheung, Sau Wai [1]
Lupski, James R. [1,3,4]
Stankiewicz, Pawel [1]
Ramocki, Melissa B. [3,5]
Shaw, Chad A. [1]
Affiliations
[1] Baylor Coll Med, Dept Mol & Human Genet, Houston, TX 77030 USA
[2] Baylor Coll Med, Houston, TX 77030 USA
[3] Texas Childrens Hosp, Houston, TX 77030 USA
[4] Baylor Coll Med, Dept Pediat, Houston, TX 77030 USA
[5] Baylor Coll Med, Dept Pediat, Sect Pediat Neurol & Dev Neurosci, Houston, TX 77030 USA
Source
PLOS GENETICS | 2013 / Vol. 9 / Issue 09
Keywords
RECURRENT MICRODELETIONS; 16P13.11; PREDISPOSE; GENE; REVEALS; 16P11.2; EPIDEMIOLOGY; INTERACTOME; SPECTRUM; MOUSE; RISK;
DOI
10.1371/journal.pgen.1003797
Chinese Library Classification number
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Curation and interpretation of copy number variants identified by genome-wide testing is challenged by the large number of events harbored in each personal genome. Conventional determination of phenotypic relevance relies on patterns of higher frequency in affected individuals versus controls; however, an increasing amount of ascertained variation is rare or private to clans. Consequently, frequency data have less utility to resolve pathogenic from benign. One solution is disease-specific algorithms that leverage gene knowledge together with variant frequency to aid prioritization. We used large-scale resources including Gene Ontology, protein-protein interactions and other annotation systems together with a broad set of 83 genes with known associations to epilepsy to construct a pathogenicity score for the phenotype. We evaluated the score for all annotated human genes and applied Bayesian methods to combine the derived pathogenicity score with frequency information from our diagnostic laboratory. Analysis determined Bayes factors and posterior distributions for each gene. We applied our method to subjects with abnormal chromosomal microarray results and confirmed epilepsy diagnoses gathered by electronic medical record review. Genes deleted in our subjects with epilepsy had significantly higher pathogenicity scores and Bayes factors compared to subjects referred for non-neurologic indications. We also applied our scores to identify a recently validated epilepsy gene in a complex genomic region and to reveal candidate genes for epilepsy. We propose a potential use in clinical decision support for our results in the context of genome-wide screening. Our approach demonstrates the utility of integrative data in medical genomics.
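The abstract describes combining a knowledge-derived pathogenicity score with frequency data through Bayes factors. The record does not give the authors' actual model, so the following is only a minimal sketch of the general idea under stated assumptions: the score is treated as a prior probability in (0, 1), deletion counts in cases and controls are modeled as binomial with uniform Beta(1, 1) priors, and the Bayes factor compares "different deletion rates in cases and controls" against "one shared rate". The function names and the example counts are hypothetical.

```python
from math import lgamma, exp


def log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bayes_factor(k_case: int, n_case: int, k_ctrl: int, n_ctrl: int) -> float:
    """Bayes factor for 'separate rates' versus 'one shared rate'.

    Assumption (not from the source): binomial counts with uniform
    Beta(1, 1) priors on each rate. The binomial coefficients cancel
    between the two marginal likelihoods, so only Beta terms remain.
    """
    # Marginal likelihood with independent rates for cases and controls.
    log_m1 = (log_beta(k_case + 1, n_case - k_case + 1)
              + log_beta(k_ctrl + 1, n_ctrl - k_ctrl + 1))
    # Marginal likelihood with a single rate shared by both groups.
    k, n = k_case + k_ctrl, n_case + n_ctrl
    log_m0 = log_beta(k + 1, n - k + 1)
    return exp(log_m1 - log_m0)


def posterior_probability(prior: float, bf: float) -> float:
    """Update a prior probability (0 < prior < 1) with a Bayes factor."""
    prior_odds = prior / (1.0 - prior)
    post_odds = prior_odds * bf
    return post_odds / (1.0 + post_odds)


if __name__ == "__main__":
    # Hypothetical counts: deletions seen in 30 of 100 cases, 0 of 100 controls.
    bf = bayes_factor(30, 100, 0, 100)
    # Hypothetical knowledge-based score of 0.2 used as the prior.
    print(posterior_probability(0.2, bf))
```

With only a handful of observations the Bayes factor in this sketch stays close to 1, so the knowledge-based prior dominates the posterior. That mirrors the motivation stated in the abstract: frequency data alone resolve rare variants poorly.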
Pages: 15