Ontology-based categorization of clinical studies by their conditions

Cited by: 6

Authors:

Liu, Hao [1]

Carini, Simona [2]

Chen, Zhehuan [1]

Hey, Spencer Phillips [3]

Sim, Ida [2]

Weng, Chunhua [1,4]

Affiliations:

[1] Columbia Univ, Dept Biomed Informat, New York, NY USA

[2] Univ Calif San Francisco, Dept Med, San Francisco, CA USA

[3] Prism Analyt Technol, Boston, MA USA

[4] Columbia Univ, Dept Biomed Informat, 622 W 168 ST,PH 20 room 407, New York, NY 10032 USA

Source:

JOURNAL OF BIOMEDICAL INFORMATICS | 2022 / Vol. 135

Keywords:

Ontology; Clinical Study; SNOMED CT; Data Visualization; Categorization; UMLS; TEXT;

DOI:

10.1016/j.jbi.2022.104235

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Objective: The free-text Condition data field in ClinicalTrials.gov is not amenable to computational processes for retrieving, aggregating, and visualizing clinical studies by condition categories. This paper contributes a method for automated ontology-based categorization of clinical studies by their conditions. Materials and Methods: Our method first maps text entries in ClinicalTrials.gov's Condition field to standard condition concepts in the OMOP Common Data Model, using SNOMED CT as a reference ontology and Usagi for concept normalization. This is followed by hierarchical traversal of the SNOMED ontology for concept expansion, ontology-driven condition categorization, and visualization. We compared the accuracy of this method to that of the MeSH-based method. Results: We reviewed the 4,506 studies on Vivli.org categorized by our method. Condition terms of 4,501 (99.89%) studies were successfully mapped to SNOMED CT concepts, and with a minimum concept mapping score threshold, 4,428 (98.27%) studies were categorized into 31 predefined categories. When validated against manual categorization results on a random sample of 300 studies, our method achieved an estimated categorization accuracy of 95.7%, while the MeSH-based method had an accuracy of 85.0%. Conclusion: We showed that categorizing clinical studies using their Condition terms with reference to SNOMED CT achieved better accuracy and coverage than using MeSH terms. The proposed ontology-driven condition categorization was useful for creating accurate clinical study categorization that enables clinical researchers to aggregate evidence from a large number of clinical studies.
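The categorization step described in the abstract (walk up the ontology from a mapped concept, then assign every category whose anchor concept is reached) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy is-a hierarchy, the concept labels, the two category anchors, the `MIN_SCORE` value, and the function names are all assumptions made for illustration, and the real method works on SNOMED CT concepts produced by Usagi with 31 predefined categories.

```python
from collections import deque

# Toy is-a hierarchy (child -> parents). Labels are illustrative only,
# not actual SNOMED CT content or identifiers.
PARENTS = {
    "Type 2 diabetes mellitus": ["Diabetes mellitus"],
    "Diabetes mellitus": ["Endocrine disorder"],
    "Breast carcinoma": ["Malignant neoplasm"],
    "Malignant neoplasm": ["Neoplasm"],
    "Endocrine disorder": ["Disease"],
    "Neoplasm": ["Disease"],
}

# Hypothetical anchor concepts, each standing for one predefined category.
CATEGORY_ANCHORS = {
    "Endocrine disorder": "Endocrine",
    "Neoplasm": "Oncology",
}

# Assumed threshold; the abstract does not state the value that was used.
MIN_SCORE = 0.8


def ancestors(concept):
    """Return every concept reachable by following is-a links upward."""
    seen = set()
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for parent in PARENTS.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


def categorize(mapped_concept, score):
    """Assign categories to a mapped concept, or none if the score is too low."""
    if score < MIN_SCORE:
        return set()
    reachable = ancestors(mapped_concept) | {mapped_concept}
    return {CATEGORY_ANCHORS[c] for c in reachable if c in CATEGORY_ANCHORS}
```

In this sketch, a study whose condition maps to "Type 2 diabetes mellitus" with a high score lands in the hypothetical "Endocrine" category, while a mapping below the threshold is left uncategorized, mirroring the gap between the mapped and categorized study counts reported in the abstract.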

Pages: 10
