Extracting cancer concepts from clinical notes using natural language processing: a systematic review

Cited by: 17
Authors
Gholipour, Maryam [1]
Khajouei, Reza [2]
Amiri, Parastoo [1]
Gohari, Sadrieh Hajesmaeel [3]
Ahmadian, Leila [2]
Affiliations
[1] Kerman Univ Med Sci, Student Res Comm, Kerman, Iran
[2] Kerman Univ Med Sci, Fac Management & Med Informat Sci, Dept Hlth Informat Sci, Kerman, Iran
[3] Kerman Univ Med Sci, Inst Futures Studies Hlth, Med Informat Res Ctr, Kerman, Iran
Keywords
Neoplasms; Natural language processing; NLP; Machine learning; Terminology; Information system; Systematic review; RADIOLOGY REPORTS; CLASSIFICATION; RETRIEVAL; RECORDS
DOI
10.1186/s12859-023-05480-0
CLC number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Background: Extracting information from free text using natural language processing (NLP) can save time and reduce the burden of manually extracting large quantities of data from the highly complex clinical notes of cancer patients. This study aimed to systematically review studies that used NLP methods to automatically identify cancer concepts in clinical notes.
Methods: PubMed, Scopus, Web of Science, and Embase were searched for English-language papers using a combination of terms concerning "Cancer", "NLP", "Coding", and "Registries" until June 29, 2021. Two reviewers independently assessed the eligibility of papers for inclusion in the review.
Results: Most of the concept-extraction software reported in the included studies was developed by the researchers themselves (n = 7). Rule-based algorithms were the most frequently used algorithms for developing these programs. Most articles evaluated the algorithms using accuracy (n = 14) and sensitivity (n = 12). In addition, Systematized Nomenclature of Medicine-Clinical Terms (SNOMED-CT) and the Unified Medical Language System (UMLS) were the most commonly used terminologies for identifying concepts. The most frequently studied cancers were breast cancer (n = 4, 19%) and lung cancer (n = 4, 19%).
Conclusion: The use of NLP for extracting the concepts and symptoms of cancer has increased in recent years. Rule-based algorithms are popular among developers. Given these algorithms' high accuracy and sensitivity in identifying and extracting cancer concepts, we suggest that future studies use them to extract the concepts of other diseases as well.
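As an illustration of the rule-based approach and the two evaluation criteria named in the abstract, the following is a minimal Python sketch and not the method of any reviewed study. The lexicon, the function names `extract_concepts` and `accuracy_and_sensitivity`, and the sample notes are all hypothetical; a real system would map matches to SNOMED-CT or UMLS identifiers rather than plain labels.

```python
import re

# Hypothetical mini-lexicon: regex pattern -> concept label (illustration only).
LEXICON = {
    r"\bbreast (?:cancer|carcinoma)\b": "breast cancer",
    r"\blung (?:cancer|carcinoma)\b": "lung cancer",
}


def extract_concepts(note):
    """Return the set of concept labels whose pattern matches the note."""
    text = note.lower()
    return {label for pattern, label in LEXICON.items() if re.search(pattern, text)}


def accuracy_and_sensitivity(predicted, actual):
    """Compute accuracy and sensitivity from parallel lists of booleans."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)          # true positives
    tn = sum(1 for p, a in pairs if not p and not a)  # true negatives
    fn = sum(1 for p, a in pairs if not p and a)      # false negatives
    accuracy = (tp + tn) / len(pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, sensitivity


# Made-up notes with a manual gold label: does the note mention a cancer concept?
notes = [
    "Biopsy confirms invasive breast carcinoma.",
    "No evidence of malignancy.",
    "History of lung cancer, in remission.",
    "Adenocarcinoma of the lung.",  # missed by the toy rules
]
gold = [True, False, True, True]
predicted = [bool(extract_concepts(n)) for n in notes]
accuracy, sensitivity = accuracy_and_sensitivity(predicted, gold)
```

The last note shows the typical weakness of hand-written rules: a phrasing outside the lexicon becomes a false negative, which lowers sensitivity while leaving the other predictions unaffected.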
Pages: 16