Clinical trial cohort selection based on multi-level rule-based natural language processing system

Cited by: 34
Authors
Chen, Long [1]
Gu, Yu [1]
Ji, Xin [1]
Lou, Chao [1]
Sun, Zhiyong [1]
Li, Haodan [1]
Gao, Yuan [1]
Huang, Yang [1]
Affiliation
[1] Med Data Quest Inc, 505 Coast Blvd S, La Jolla, CA 92037 USA
Keywords
clinical natural language processing; cohort selection; clinical trial; rule-based system; UMLS; INFORMATION; RECORDS
DOI
10.1093/jamia/ocz109
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Objective: Identifying patients who meet the selection criteria for clinical trials is typically challenging and time-consuming. In this article, we describe our clinical natural language processing (NLP) system, which automatically assesses patients' eligibility based on their longitudinal medical records. This work was part of the 2018 National NLP Clinical Challenges (n2c2) Shared-Task and Workshop on Cohort Selection for Clinical Trials.
Materials and Methods: The authors developed an integrated rule-based clinical NLP system that employs a generic rule-based framework supplied with lexical-, syntactic-, and meta-level task-specific knowledge inputs. The authors also implemented and evaluated a general clinical NLP (cNLP) system built with the Unified Medical Language System (UMLS) and the Unstructured Information Management Architecture.
Results and Discussion: The systems were evaluated as part of the 2018 n2c2-1 challenge. The authors' rule-based system obtained an F-measure of 0.9028, ranking fourth in the challenge, with less than a 1% difference from the best system. Although the general cNLP system did not perform as well as the rule-based system, it showed its own advantages and potential in extracting clinical concepts.
Conclusion: Our results indicate that a well-designed rule-based clinical NLP system can achieve good performance on cohort selection even with a small training data set. In addition, the investigation of a UMLS-based general cNLP system suggests that a hybrid system combining these 2 approaches is a promising way to surpass state-of-the-art performance.
Pages: 1218-1226
Page count: 9