Dynamic categorization of clinical research eligibility criteria by hierarchical clustering

Cited by: 34
Authors
Luo, Zhihui [1]
Yetisgen-Yildiz, Meliha [2]
Weng, Chunhua [1]
Affiliations
[1] Columbia Univ, Dept Biomed Informat, New York, NY 10032 USA
[2] Univ Washington, Seattle, WA 98195 USA
Keywords
Clinical research eligibility criteria; Classification; Hierarchical clustering; Knowledge representation; Unified Medical Language System (UMLS); Machine learning; Feature representation
DOI
10.1016/j.jbi.2011.06.001
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Objective: To semi-automatically induce semantic categories of eligibility criteria from text and to automatically classify eligibility criteria based on their semantic similarity.

Design: The UMLS semantic types and a set of previously developed semantic preference rules were used to create an unambiguous semantic feature representation, which served both to induce eligibility criteria categories through hierarchical clustering and to train supervised classifiers.

Measurements: We induced 27 categories and measured their prevalence in 27,278 eligibility criteria from 1578 clinical trials. We compared classification performance (precision, recall, and F1-score) between the UMLS-based feature representation and the "bag of words" feature representation across five common classifiers in Weka: J48, Bayesian Network, Naive Bayesian, Nearest Neighbor, and an instance-based learning classifier.

Results: The UMLS semantic feature representation outperformed the "bag of words" feature representation in 89% of the criteria categories. Using the semantically induced categories, machine-learning classifiers required only 2000 instances to stabilize classification performance. The J48 classifier yielded the best F1-score, and the Bayesian Network classifier achieved the best learning efficiency.

Conclusion: The UMLS is an effective knowledge source and can enable an efficient feature representation for semi-automated semantic category induction and automatic categorization of clinical research eligibility criteria and possibly other clinical text.

(C) 2011 Elsevier Inc. All rights reserved.
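
The abstract describes inducing categories by hierarchically clustering criteria that are represented as sets of UMLS semantic types. The sketch below illustrates that general idea only; it is not the authors' implementation. The toy criteria, their semantic-type assignments, the Jaccard similarity, the average-linkage rule, and the 0.3 stopping threshold are all assumptions chosen for illustration, since the abstract does not specify them.

```python
from itertools import combinations

# Hypothetical toy data: each criterion is mapped by hand to a set of
# UMLS-style semantic type labels. A real pipeline would obtain these
# from a UMLS-based annotator plus semantic preference rules.
CRITERIA = {
    "age >= 18 years": {"Age Group", "Quantitative Concept"},
    "age under 65": {"Age Group", "Quantitative Concept"},
    "history of diabetes mellitus": {"Disease or Syndrome"},
    "diagnosis of hypertension": {"Disease or Syndrome", "Finding"},
    "currently taking insulin": {"Pharmacologic Substance"},
}


def jaccard(a, b):
    """Jaccard similarity between two sets of semantic types (assumed measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def average_linkage(c1, c2, features):
    """Mean pairwise similarity between the members of two clusters."""
    pairs = [(x, y) for x in c1 for y in c2]
    return sum(jaccard(features[x], features[y]) for x, y in pairs) / len(pairs)


def cluster(features, threshold):
    """Agglomerative clustering: repeatedly merge the most similar pair of
    clusters until the best similarity falls below `threshold`."""
    clusters = [[name] for name in features]
    while len(clusters) > 1:
        (i, j), best = max(
            (
                ((i, j), average_linkage(clusters[i], clusters[j], features))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best < threshold:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


if __name__ == "__main__":
    for group in cluster(CRITERIA, threshold=0.3):
        print(sorted(group))
```

With this toy input, the two age criteria share identical semantic types and merge first, the two disease criteria merge next, and the medication criterion stays on its own. In practice a library routine such as SciPy's hierarchical clustering would likely replace the hand-written loop.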
Pages: 927-935
Page count: 9