Improved dominance rough set-based classification system

Cited by: 38
Authors
Azar, Ahmad Taher [1]
Inbarani, H. Hannah [2]
Devi, K. Renuga [2]
Affiliations
[1] Benha Univ, Fac Comp & Informat, Banha, Egypt
[2] Periyar Univ, Dept Comp Sci, Salem 636011, Tamil Nadu, India
Keywords
Dominance-based rough set; Decision table; Feature selection; Classification; FUZZY FEATURE-SELECTION; KNOWLEDGE ACQUISITION; INFORMATION-SYSTEMS; LINGUISTIC HEDGES; RULES; MODEL; APPROXIMATIONS; GENERATION; REDUCTION;
DOI
10.1007/s00521-016-2177-z
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Feature selection and classification are widely used in many areas of science and engineering as large datasets become increasingly common. In particular, bioscience and medical datasets routinely contain several thousand features. Many methods and techniques have been developed for effective data mining in such databases. Rough set theory is a mathematical theory for dealing with uncertainty. In the dominance-based extension of rough set theory, the set of objects is partitioned into pre-defined, preference-ordered classes, and the approach approximates this partition by means of dominance relations. This paper proposes an improved dominance-based rough set for the classification of medical data. The dominance-based rough set handles ordinal attributes; this paper proposes a technique for applying it to nominal attributes. The proposed work uses a decision table to determine the dominance relation, and the improved dominance-based rough set is then applied to find the lower, upper and boundary approximations over the entire dataset. Attribute reduction based on the proposed technique is then applied to find the essential attributes required for classification. The proposed method can accurately classify medical datasets collected from the UCI repository. It is evaluated on seven datasets: the heart disease, Pima Indian diabetes, Breast cancer Wisconsin, heart valve, jaundice, dermatology and lung cancer datasets. Compared in classification accuracy with rule-based classifiers (Zero R, decision table), tree-based classifiers (J48, Random forest, Random Tree), a neural network-based classifier (multilayer perceptron), lazy classifiers (IBk, KStar, LWL), a Bayesian classifier (Naive Bayes), the benchmark k-nearest-neighbour algorithm, and the classical rough set approach, the improved dominance-based rough set gives higher accuracy.
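The lower, upper and boundary approximations mentioned in the abstract can be illustrated with a minimal sketch of the classical dominance-based rough set approach. This is not the improved method the paper proposes, which the abstract does not specify. The toy decision table `TABLE`, the function names, and the assumption that higher values are preferred on every criterion are all hypothetical and chosen only for illustration.

```python
# Hypothetical toy decision table: object -> (criterion values, class label).
# Assumption: higher values are preferred on every criterion and class.
TABLE = {
    "a": ((3, 3), 2),
    "b": ((2, 2), 2),
    "c": ((2, 3), 1),
    "d": ((1, 1), 1),
}


def dominates(x, y, table):
    """True if x is at least as good as y on every criterion."""
    return all(xv >= yv for xv, yv in zip(table[x][0], table[y][0]))


def dominating_set(x, table):
    """Objects that dominate x."""
    return {y for y in table if dominates(y, x, table)}


def dominated_set(x, table):
    """Objects that x dominates."""
    return {y for y in table if dominates(x, y, table)}


def upward_union(t, table):
    """Objects whose class is t or better."""
    return {x for x, (_, cls) in table.items() if cls >= t}


def approximations(t, table):
    """Lower, upper and boundary approximations of the upward union of class t."""
    target = upward_union(t, table)
    # Lower: every object dominating x also belongs to the target union.
    lower = {x for x in table if dominating_set(x, table) <= target}
    # Upper: x dominates at least one object of the target union.
    upper = {x for x in table if dominated_set(x, table) & target}
    return lower, upper, upper - lower


lower, upper, boundary = approximations(2, TABLE)
print(sorted(lower), sorted(upper), sorted(boundary))
```

In this toy table, object `c` dominates object `b` yet is assigned a worse class, so both end up in the boundary region for class 2 or better, while only `a` belongs to that union with certainty.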
Pages: 2231-2246
Number of pages: 16