Assessment of Electronic Health Record for Cancer Research and Patient Care Through a Scoping Review of Cancer Natural Language Processing

Cited by: 23
Authors
Wang, Liwei [1]
Fu, Sunyang [1]
Wen, Andrew [1]
Ruan, Xiaoyang [1]
He, Huan [1]
Liu, Sijia [1]
Moon, Sungrim [1]
Mai, Michelle [1]
Riaz, Irbaz B. [2]
Wang, Nan [3]
Yang, Ping [4]
Xu, Hua [5]
Warner, Jeremy L. [6, 7, 8]
Liu, Hongfang [1]
Affiliations
[1] Mayo Clin, Dept Artificial Intelligence & Informat, Rochester, MN USA
[2] Mayo Clin, Dept Hematol Oncol, Scottsdale, AZ USA
[3] Univ Minnesota, Dept Comp Sci & Engn, Coll Sci & Engn, Minneapolis, MN USA
[4] Mayo Clin, Dept Quantitat Hlth Sci, Scottsdale, AZ USA
[5] Univ Texas Hlth Sci Ctr Houston, Sch Biomed Informat, Houston, TX 77030 USA
[6] Vanderbilt Univ, Dept Med, Nashville, TN USA
[7] Vanderbilt Univ, Dept Hematol Oncol, 221 Kirkland Hall, Nashville, TN 37235 USA
[8] Vanderbilt Univ, Dept Biomed Informat, 221 Kirkland Hall, Nashville, TN 37235 USA
Source
JCO CLINICAL CANCER INFORMATICS | 2022 / Vol. 6
Funding
National Institutes of Health (USA);
Keywords
CLINICAL DECISION-SUPPORT; RADIOLOGY REPORTS; BREAST-CANCER; AUTOMATED IDENTIFICATION; INFORMATION EXTRACTION; PATHOLOGICAL FINDINGS; PANCREATIC-CANCER; COLONOSCOPY; SYSTEM; VALIDATION;
DOI
10.1200/CCI.22.00006
Chinese Library Classification number
R73 [Oncology];
Discipline classification code
100214;
Abstract
PURPOSE: The advancement of natural language processing (NLP) has promoted the use of detailed textual data in electronic health records (EHRs) to support cancer research and to facilitate patient care. In this review, we aim to assess EHRs for cancer research and patient care by using the Minimal Common Oncology Data Elements (mCODE), a community-driven effort to define a minimal set of data elements for cancer research and practice. Specifically, we aim to assess the alignment of NLP-extracted data elements with mCODE and to review existing NLP methodologies for extracting those data elements.
METHODS: The main literature databases were searched for cancer-related NLP articles written in English and published between January 2010 and September 2020. After retrieval, articles that used EHRs as the data source were manually identified. A charting form was developed for analyzing the relevant studies and used to categorize data under four main topics: metadata, EHR data and targeted cancer types, NLP methodology, and oncology data elements and standards.
RESULTS: A total of 123 publications were selected and included in our analysis. As expected, we found that cancer research and patient care require some data elements beyond mCODE. Transparency and reproducibility of NLP methods are insufficient, and NLP evaluation is inconsistent.
CONCLUSION: We conducted a comprehensive review of cancer NLP for research and patient care using EHR data. Issues and barriers to the wide adoption of cancer NLP were identified and discussed.
Pages: 17