A machine learning method for incomplete and imbalanced medical data

Cited by: 0
Authors
Salman, Issam [1]
Vomlel, Jiri [2]
Affiliations
[1] Czech Tech Univ, Fac Nucl Sci & Phys Engn, Dept Software Engn, Trojanova 13, Prague 12000, Czech Republic
[2] Acad Sci Czech Republ, Inst Informat Theory & Automat, Pod Vodarenskou Vezi 4, CR-18208 Prague, Czech Republic
Source
PROCEEDINGS OF THE 20TH CZECH-JAPAN SEMINAR ON DATA ANALYSIS AND DECISION MAKING UNDER UNCERTAINTY | 2017
Keywords
Machine Learning; Data analysis; Bayesian networks; Missing data; Imbalanced data; Acute Myocardial Infarction;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The research reported in this paper is twofold. In the first part, we use standard statistical methods to analyze medical records of patients suffering from myocardial infarction in a developing country, Syria, and in a developed country, the Czech Republic. One of our goals is to determine whether there are statistically significant differences between the two countries. In the second part, we present an idea for dealing with incomplete and imbalanced data when learning tree-augmented naive Bayes (TAN) classifiers. All results presented in this paper are based on real data on 603 patients from a hospital in the Czech Republic and 184 patients from two hospitals in Syria.
Pages: 188-195
Number of pages: 8