ClinPhen extracts and prioritizes patient phenotypes directly from medical records to expedite genetic disease diagnosis

Cited by: 53
Authors
Deisseroth, Cole A. [1]
Birgmeier, Johannes [1]
Bodle, Ethan E. [2]
Kohler, Jennefer N. [3]
Matalon, Dena R. [2]
Nazarenko, Yelena [4]
Genetti, Casie A. [5]
Brownstein, Catherine A. [5]
Schmitz-Abe, Klaus [5]
Schoch, Kelly [6]
Cope, Heidi [6]
Signer, Rebecca [7]
Undiagnosed Diseases Network
Martinez-Agosto, Julian A. [7, 8, 9]
Shashi, Vandana [6]
Beggs, Alan H. [5]
Wheeler, Matthew T. [3, 10]
Bernstein, Jonathan A. [2]
Bejerano, Gill [1, 2, 4, 11]
Affiliations
[1] Stanford Univ, Dept Comp Sci, Stanford, CA 94305 USA
[2] Stanford Sch Med, Dept Pediat, Stanford, CA 94305 USA
[3] Stanford Ctr Undiagnosed Dis, Stanford, CA USA
[4] Stanford Univ, Dept Biomed Data Sci, Stanford, CA 94305 USA
[5] Harvard Med Sch, Boston Childrens Hosp, Div Genet & Genom, Manton Ctr Orphan Dis Res, Boston, MA 02115 USA
[6] Duke Univ, Sch Med, Dept Pediat, Durham, NC USA
[7] Univ Calif Los Angeles, David Geffen Sch Med, Dept Human Genet, Los Angeles, CA 90095 USA
[8] Univ Calif Los Angeles, David Geffen Sch Med, Dept Pediat, Div Med Genet, Los Angeles, CA 90095 USA
[9] Univ Calif Los Angeles, David Geffen Sch Med, Dept Psychiat, Los Angeles, CA 90095 USA
[10] Stanford Sch Med, Dept Med, Stanford, CA 94305 USA
[11] Stanford Univ, Dept Dev Biol, Stanford, CA 94305 USA
Keywords
medical genetics; Mendelian disease diagnosis; natural language processing; prioritized disease phenotypes; VARIANTS; ONTOLOGY; IDENTIFICATION; ARCHITECTURE; DISCOVERY; ACCURATE; TEXT;
DOI
10.1038/s41436-018-0381-1
Chinese Library Classification number
Q3 [Genetics]
Subject classification codes
071007; 090102
Abstract
Purpose: Diagnosing monogenic diseases facilitates optimal care, but can involve the manual evaluation of hundreds of genetic variants per case. Computational tools like Phrank expedite this process by ranking all candidate genes by their ability to explain the patient's phenotypes. To use these tools, busy clinicians must manually encode patient phenotypes from lengthy clinical notes. With 100 million human genomes estimated to be sequenced by 2025, a fast alternative to manual phenotype extraction from clinical notes will become necessary. Methods: We introduce ClinPhen, a fast, high-accuracy tool that automatically converts clinical notes into a prioritized list of patient phenotypes using Human Phenotype Ontology (HPO) terms. Results: ClinPhen shows superior accuracy and 20x speedup over existing phenotype extractors, and its novel phenotype prioritization scheme improves the performance of gene-ranking tools. Conclusion: While a dedicated clinician can process 200 patient records in a 40-hour workweek, ClinPhen does the same in 10 minutes. Compared with manual phenotype extraction, ClinPhen saves an additional 3-5 hours per Mendelian disease diagnosis. Providers can now add ClinPhen's output to each summary note attached to a filled testing laboratory request form. ClinPhen makes a substantial contribution to improvements in efficiency critically needed to meet the surging demand for clinical diagnostic sequencing.
Pages: 1585-1593
Number of pages: 9