Automatic Extraction of Cancer Characteristics from Free-Text Pathology Reports for Cancer Notifications

Cited by: 12
Authors
Nguyen, Anthony [1]
Moore, Julie
Lawley, Michael [1]
Hansen, David [1]
Colquist, Shoni
Affiliations
[1] CSIRO, ICT Ctr, Australian E Hlth Res Ctr, Brisbane, Qld, Australia
Source
HEALTH INFORMATICS: THE TRANSFORMATIVE POWER OF INNOVATION | 2011 / Vol. 168
Keywords
Automatic Data Processing; Data Mining; Disease Notification; Neoplasm; Systematised Nomenclature of Medicine; RETRIEVAL
DOI
10.3233/978-1-60750-791-8-117
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Objective: To develop a system for the automatic classification of Cancer Registry notification data from free-text pathology reports. Method: Cancer notification items are extracted using a symbolic, rule-based classification methodology, in which formal semantics are used to reason with the Systematised Nomenclature of Medicine - Clinical Terms (SNOMED CT) concepts identified in the free text. Business rules for cancer notifications used by Cancer Registry coding staff were also incorporated to mimic Cancer Registry processes. Results: The system was developed on a corpus of 239 histology and cytology reports (60% notifiable) and then evaluated on an independent set of 300 reports (20% notifiable). Results show that the system can reliably classify notifiable reports with 96% and 100% specificity, and achieve an overall accuracy of 82% and 74% for classifying notification items from notifiable reports at a unit record level from the development and evaluation sets, respectively. Conclusion: Cancer Registries collect a multitude of data that requires manual review, which slows the flow of information. Extracting and providing an automatically coded cancer pathology notification for review can lessen the reliance on expert clinical staff, improving the efficiency and availability of cancer information.
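The rule-based idea in the Method can be illustrated with a minimal sketch. It is not the authors' implementation: the toy is-a hierarchy, the concept labels (which are not real SNOMED CT identifiers), the `Mention` type, the negation flag, and the single "notifiable" rule are all assumptions made for illustration. It shows only how a subsumption check over concepts found in a report could drive a notifiability decision.

```python
from dataclasses import dataclass

# Hypothetical toy is-a hierarchy (child -> parents). The labels are
# illustrative placeholders, not real SNOMED CT concepts or codes.
IS_A = {
    "adenocarcinoma": {"malignant_neoplasm"},
    "squamous_cell_carcinoma": {"malignant_neoplasm"},
    "malignant_neoplasm": {"neoplasm"},
    "benign_naevus": {"benign_neoplasm"},
    "benign_neoplasm": {"neoplasm"},
}


def is_subsumed_by(concept: str, ancestor: str) -> bool:
    """Return True if `concept` equals `ancestor` or reaches it via is-a links."""
    stack, seen = [concept], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(IS_A.get(current, ()))
    return False


@dataclass(frozen=True)
class Mention:
    """A concept identified in the report text (assumed upstream step)."""
    concept: str
    negated: bool = False  # assumption: negation is detected before this step


def is_notifiable(mentions: list[Mention]) -> bool:
    """Assumed rule: any non-negated mention subsumed by a malignant neoplasm."""
    return any(
        not m.negated and is_subsumed_by(m.concept, "malignant_neoplasm")
        for m in mentions
    )
```

For example, `is_notifiable([Mention("adenocarcinoma")])` is true under this assumed rule, while a negated mention or a benign finding alone is not. A real system would also need the registry's business rules, which the abstract mentions but does not specify.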
Pages: 117-124
Page count: 8