Deep Phenotyping on Electronic Health Records Facilitates Genetic Diagnosis by Clinical Exomes

Cited by: 83
Authors
Son, Jung Hoon [1]
Xie, Gangcai [1,2,3]
Yuan, Chi [1]
Ena, Lyudmila [1]
Li, Ziran [1]
Goldstein, Andrew [1]
Huang, Lulin [2,3]
Wang, Liwei [4]
Shen, Feichen [4]
Liu, Hongfang [4]
Mehl, Karla [5]
Groopman, Emily E. [5]
Marasa, Maddalena [5]
Kiryluk, Krzysztof [5]
Gharavi, Ali G. [5]
Chung, Wendy K. [6]
Hripcsak, George [1]
Friedman, Carol [1]
Weng, Chunhua [1]
Wang, Kai [1,2,3,7,8]
Affiliations
[1] Columbia Univ, Dept Biomed Informat, New York, NY 10032 USA
[2] Columbia Univ, Inst Genom Med, New York, NY 10032 USA
[3] Childrens Hosp Philadelphia, Raymond G Perelman Ctr Cellular & Mol Therapeut, Philadelphia, PA 19104 USA
[4] Mayo Clin, Div Biomed Stat & Informat, Rochester, MN 55901 USA
[5] Columbia Univ, Dept Med, Div Nephrol, New York, NY 10032 USA
[6] Columbia Univ, Dept Pediat & Med, New York, NY 10032 USA
[7] Childrens Hosp Philadelphia, Dept Biomed & Hlth Informat, Philadelphia, PA 19104 USA
[8] Univ Penn, Dept Pathol & Lab Med, Perelman Sch Med, Philadelphia, PA 19104 USA
Keywords
CONGENITAL HEART-DISEASE; PRIORITIZATION; IDENTIFICATION; CHALLENGES
DOI
10.1016/j.ajhg.2018.05.010
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
Integration of detailed phenotype information with genetic data is well established to facilitate accurate diagnosis of hereditary disorders. As a rich source of phenotype information, electronic health records (EHRs) promise to empower diagnostic variant interpretation. However, how to accurately and efficiently extract phenotypes from heterogeneous EHR narratives remains a challenge. Here, we present EHR-Phenolyzer, a high-throughput EHR framework for extracting and analyzing phenotypes. EHR-Phenolyzer extracts and normalizes Human Phenotype Ontology (HPO) concepts from EHR narratives and then prioritizes genes with causal variants on the basis of the HPO-coded phenotype manifestations. We assessed EHR-Phenolyzer on 28 pediatric individuals with confirmed diagnoses of monogenic diseases and found that the genes with causal variants were ranked among the top 100 genes selected by EHR-Phenolyzer for 16/28 individuals (p < 2.2 × 10^-16), supporting the value of phenotype-driven gene prioritization in diagnostic sequence interpretation. To assess the generalizability, we replicated this finding on an independent EHR dataset of ten individuals with a positive diagnosis from a different institution. We then assessed the broader utility by examining two additional EHR datasets, including 31 individuals who were suspected of having a Mendelian disease and underwent different types of genetic testing and 20 individuals with positive diagnoses of specific Mendelian etiologies of chronic kidney disease from exome sequencing. Finally, through several retrospective case studies, we demonstrated how combined analyses of genotype data and deep phenotype data from EHRs can expedite genetic diagnoses. In summary, EHR-Phenolyzer leverages EHR narratives to automate phenotype-driven analysis of clinical exomes or genomes, facilitating the broader implementation of genomic medicine.
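The abstract does not say which statistical test produced the reported p-value. As a rough sanity check only, the sketch below asks how likely it would be for 16 or more of 28 causal genes to land in a top-100 list if genes were ranked at random. The gene universe of 20,000 and the choice of a one-sided binomial tail are assumptions made here for illustration, not details taken from the paper.

```python
from math import comb


def binomial_tail(successes: int, trials: int, p: float) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# ASSUMPTION: roughly 20,000 candidate genes; not stated in the abstract.
ASSUMED_GENE_UNIVERSE = 20_000
TOP_K = 100  # from the abstract: causal gene ranked among the top 100

# Under random ranking, each causal gene falls in the top 100
# with probability TOP_K / ASSUMED_GENE_UNIVERSE.
p_null = TOP_K / ASSUMED_GENE_UNIVERSE

# From the abstract: 16 of 28 individuals had the causal gene in the top 100.
tail = binomial_tail(16, 28, p_null)
print(f"P(at least 16 of 28 in top {TOP_K} by chance) = {tail:.2e}")
```

Under these assumptions the chance probability is many orders of magnitude below the 2.2 × 10^-16 bound quoted in the abstract, which is consistent with the reported enrichment; a different gene universe or test would change the exact figure.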
Pages: 58-73
Page count: 16