Applying Text Mining for Classifying Disease from Symptoms

Times cited: 0
Authors
Ketpupong, Pannaporn [1]
Piromsopa, Krerk [1]
Affiliations
[1] Chulalongkorn Univ, Dept Comp Engn, Fac Engn, Bangkok, Thailand
Source
2018 18TH INTERNATIONAL SYMPOSIUM ON COMMUNICATIONS AND INFORMATION TECHNOLOGIES (ISCIT) | 2018
Keywords
Classification; Data preparation; Disease; ICD-10-CM; Symptom; Text mining; RECORDS; ADMISSIONS
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Nowadays, misdiagnoses account for a significant portion of medical errors [1]. This is because each physician's diagnosis depends on that physician's knowledge, skill, and experience. In several cases, physicians may overlook uncommon diseases. Also, after the diagnosis, the physician has to provide an ICD-10-CM code, which is a difficult process for most (if not all) physicians. We propose a predictive model for classifying disease from symptoms by applying text mining techniques. Our technique allows physicians to diagnose and to obtain an ICD-10-CM code directly from symptoms. Our models are based on several classifiers, such as Decision Tree, Naive Bayes, Support Vector Machine, and Neural Network. Models from each classifier were compared using training time, prediction time, Receiver Operating Characteristic (ROC) curve, True Positive Rate (TPR), False Positive Rate (FPR), precision, and accuracy. The results suggest that Neural Network gives the best TPR, at 89.03%.
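As an illustration of the comparison metrics named in the abstract, the following is a minimal sketch of how TPR, FPR, precision, and accuracy can be computed for one disease label treated as the positive class. It is not the authors' implementation: the function names (`count_outcomes`, `rates`) and the toy labels are hypothetical, chosen only to show the standard definitions.

```python
from dataclasses import dataclass


@dataclass
class Outcomes:
    tp: int  # positive cases predicted positive
    fp: int  # negative cases predicted positive
    tn: int  # negative cases predicted negative
    fn: int  # positive cases predicted negative


def count_outcomes(y_true, y_pred, positive):
    """Count the four outcomes, treating `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return Outcomes(tp, fp, tn, fn)


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def rates(c):
    """Standard definitions of the metrics listed in the abstract."""
    return {
        "tpr": safe_div(c.tp, c.tp + c.fn),        # true positive rate (recall)
        "fpr": safe_div(c.fp, c.fp + c.tn),        # false positive rate
        "precision": safe_div(c.tp, c.tp + c.fp),
        "accuracy": safe_div(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
    }


# Hypothetical toy labels, not data from the paper.
y_true = ["flu", "flu", "cold", "flu", "cold", "cold"]
y_pred = ["flu", "cold", "cold", "flu", "flu", "cold"]

c = count_outcomes(y_true, y_pred, positive="flu")
m = rates(c)
print(c)
print(m)
```

For a multi-class task such as assigning ICD-10-CM codes, these per-label values would typically be computed once per code and then averaged; how the paper aggregates them is not stated in the abstract.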
Pages: 467-472
Number of pages: 6
Related papers
19 in total
[1]   Unsupervised text mining for assessing and augmenting GWAS results [J].
Ailem, Melissa ;
Role, Francois ;
Nadif, Mohamed ;
Demenais, Florence .
JOURNAL OF BIOMEDICAL INFORMATICS, 2016, 60 :252-259
[2]   Feature-ranking-based Alzheimer's disease classification from structural MRI [J].
Beheshti, Iman ;
Demirel, Hasan .
MAGNETIC RESONANCE IMAGING, 2016, 34 (03) :252-263
[3]  
Bird S, 2009, Natural language processing with Python, DOI 10.5555/1717171
[4]   Agile text mining for the 2014 i2b2/UTHealth Cardiac risk factors challenge [J].
Cormack, James ;
Nath, Chinmoy ;
Milward, David ;
Raja, Kalpana ;
Jonnalagadda, Siddhartha R. .
JOURNAL OF BIOMEDICAL INFORMATICS, 2015, 58 :S120-S127
[5]   An introduction to ROC analysis [J].
Fawcett, Tom .
PATTERN RECOGNITION LETTERS, 2006, 27 (08) :861-874
[6]  
Jatunarapit P, 2016, P 8 INT C EL COMP AR, P1
[7]   Coronary artery disease risk assessment from unstructured electronic health records using text mining [J].
Jonnagaddala, Jitendra ;
Liaw, Siaw-Teng ;
Ray, Pradeep ;
Kumar, Manish ;
Chang, Nai-Wen ;
Dai, Hong-Jie .
JOURNAL OF BIOMEDICAL INFORMATICS, 2015, 58 :S203-S210
[8]   Text mining electronic hospital records to automatically classify admissions against disease: Measuring the impact of linking data sources [J].
Kocbek, Simon ;
Cavedon, Lawrence ;
Martinez, David ;
Bain, Christopher ;
Mac Manus, Chris ;
Haffari, Gholamreza ;
Zukerman, Ingrid ;
Verspoor, Karin .
JOURNAL OF BIOMEDICAL INFORMATICS, 2016, 64 :158-167
[9]  
Limpiyakorn Y, 2013, DATA MINING
[10]   Text mining approach to predict hospital admissions using early medical records from the emergency department [J].
Lucini, Filipe R. ;
Fogliatto, Flavio S. ;
da Silveira, Giovani J. C. ;
Neyeloff, Jeruza L. ;
Anzanello, Michel J. ;
Kuchenbecker, Ricardo de S. ;
Schaan, Beatriz D. .
INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2017, 100 :1-8