Evaluating Methods for Identifying Cancer in Free-Text Pathology Reports Using Various Machine Learning and Data Preprocessing Approaches

Cited by: 4
Authors
Kasthurirathne, Suranga Nath [1]
Dixon, Brian E. [2,3]
Grannis, Shaun J. [2,4]
Affiliations
[1] Indiana Univ, Sch Informat & Comp, 535 W Michigan St,IT 475, Indianapolis, IN 46202 USA
[2] Regenstrief Inst Hlth Care, Indianapolis, IN USA
[3] Indiana Univ, Fairbanks Sch Publ Hlth, Indianapolis, IN USA
[4] Indiana Univ Sch Med, Indianapolis, IN 46202 USA
Source
MEDINFO 2015: EHEALTH-ENABLED HEALTH | 2015 / Vol. 216
Keywords
Public health reporting; decision models; ontologies; cancer; pathology; data preprocessing
DOI
10.3233/978-1-61499-564-7-1070
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Automated detection methods can address delays and incompleteness in cancer case reporting. Existing automated efforts are largely dependent on complex dictionaries and coded data. Using a gold standard of manually reviewed pathology reports, we evaluated the performance of alternative input formats and decision models on a convenience sample of free-text pathology reports. Results showed that the input format significantly impacted performance, and specific algorithms yielded better results for precision, recall, and accuracy. We conclude that our approach is sufficiently accurate for practical purposes and represents a generalized process.
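The abstract reports precision, recall, and accuracy measured against a manually reviewed gold standard. As a minimal sketch of how those three metrics relate to one another, the Python below compares binary "cancer / not cancer" predictions with gold-standard labels. The function name `evaluate_predictions` and the example labels are hypothetical and chosen only for illustration; they are not the study's code or data.

```python
from typing import Dict, Sequence


def evaluate_predictions(gold: Sequence[bool], predicted: Sequence[bool]) -> Dict[str, float]:
    """Compare predicted labels with gold-standard labels (True = cancer report)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    tn = sum(1 for g, p in zip(gold, predicted) if not g and not p)

    total = tp + fp + fn + tn
    return {
        # Of the reports flagged as cancer, the fraction that truly are.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of the true cancer reports, the fraction that were flagged.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Fraction of all reports labeled correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
    }


# Hypothetical labels for eight reports (illustration only, not study data).
gold_labels = [True, True, True, False, False, False, False, True]
model_labels = [True, True, False, False, True, False, False, True]
metrics = evaluate_predictions(gold_labels, model_labels)
print(metrics)
```

In this made-up example there are three true positives, one false positive, one false negative, and three true negatives, so all three metrics come out to 0.75. In a cancer-reporting setting, recall is often the metric to watch, since a missed case is a report that never reaches the registry.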
Pages: 1070-1070
Page count: 1