Collection of cancer stage data by classifying free-text medical reports

Cited by: 54
Authors
McCowan, Iain A.
Moore, Darren C.
Nguyen, Anthony N.
Bowman, Rayleen V.
Clarke, Belinda E.
Duhig, Edwina E.
Fry, Mary-Jane
Affiliations
[1] CSIRO, E-Health Research Centre, Brisbane, Qld 4000, Australia
[2] University of Queensland, Department of Medicine, Brisbane, Qld 4000, Australia
[3] Prince Charles Hospital, Department of Anatomical Pathology, Brisbane, Qld 4032, Australia
[4] Queensland Health, Queensland Cancer Control Analysis Team, Brisbane, Qld, Australia
DOI
10.1197/jamia.M2130
Chinese Library Classification
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Cancer staging provides a basis for planning clinical management and also allows meaningful analysis of cancer outcomes and evaluation of cancer care services. Despite this, stage data in cancer registries are often incomplete, inaccurate, or simply not collected. This article describes a prototype software system (Cancer Stage Interpretation System, CSIS) that automatically extracts cancer staging information from medical reports. The system uses text classification techniques to train support vector machines (SVMs) to extract the elements of stage listed in cancer staging guidelines. When processing new reports, CSIS identifies sentences relevant to the staging decision and then assigns the most likely stage. The system was developed using a database of staging data and pathology reports for 710 lung cancer patients, then validated in an independent set of 179 patients against the pathologic stage assigned by two independent pathologists. CSIS achieved an overall accuracy of 74% for tumor (T) staging and 87% for node (N) staging, and its errors were observed to mirror disagreements between human experts.
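The sentence-relevance step mentioned in the abstract can be illustrated with a minimal sketch. This is not the CSIS implementation: the toy sentences, the bag-of-words features, the function names (`train_linear_svm`, `relevant_sentences`), and the hyperparameters are all assumptions chosen for illustration, and the classifier is a small linear SVM trained by subgradient descent on the regularized hinge loss rather than whatever SVM package the authors used.

```python
import re
from collections import defaultdict

# Hypothetical toy data: +1 = sentence relevant to T staging, -1 = not relevant.
TRAINING = [
    ("Tumour measures 45 mm in maximum dimension", +1),
    ("Specimen received fresh from theatre", -1),
    ("The tumour invades the visceral pleura", +1),
    ("Clinical history not provided", -1),
    ("Tumour extends into the chest wall", +1),
    ("Sections show normal lung parenchyma", -1),
]

def tokenize(sentence):
    """Lower-case bag-of-words tokens (letters and digits only)."""
    return re.findall(r"[a-z0-9]+", sentence.lower())

def score(model, sentence):
    """Linear decision value: sum of token weights plus bias."""
    weights, bias = model
    return sum(weights.get(tok, 0.0) for tok in tokenize(sentence)) + bias

def train_linear_svm(examples, epochs=50, lr=0.1, reg=0.01):
    """Subgradient descent on the L2-regularized hinge loss (illustrative settings)."""
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for sentence, label in examples:
            margin = label * score((weights, bias), sentence)
            # L2 regularization: shrink every weight slightly at each step.
            for tok in weights:
                weights[tok] *= 1.0 - lr * reg
            # Hinge loss is active only when the margin is below 1.
            if margin < 1.0:
                for tok in tokenize(sentence):
                    weights[tok] += lr * label
                bias += lr * label
    return dict(weights), bias

def relevant_sentences(model, report):
    """Return the sentences of a report that the classifier marks as relevant."""
    sentences = [s.strip() for s in report.split(".") if s.strip()]
    return [s for s in sentences if score(model, s) > 0]

model = train_linear_svm(TRAINING)
report = ("Specimen received fresh from theatre. "
          "The tumour invades the visceral pleura. "
          "Clinical history not provided.")
print(relevant_sentences(model, report))
```

In a fuller pipeline, the sentences kept by `relevant_sentences` would feed a second classification step that assigns the stage itself; that step is omitted here because the abstract does not describe it in enough detail to sketch.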
Pages: 736-745
Page count: 10