A Generalized Probabilistic Approach for Managing Inconsistency to Improve Classifier Accuracy

Times cited: 0
Authors
Sil, Jaya [1]
Sen, Jaydeep [2]
Affiliations
[1] Indian Inst Engg Sci & Technol, Dept Comp Sci & Technol, Sibpur, Howrah, India
[2] IBM Res Lab, Bangalore, Karnataka, India
Source
2016 THIRD INTERNATIONAL CONFERENCE ON DIGITAL INFORMATION PROCESSING, DATA MINING, AND WIRELESS COMMUNICATIONS (DIPDMWC) | 2016
Keywords
Rule Base Classifier; Discretization; Inconsistency; Random Process; Dimensionality Reduction; Outlier
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Removing inconsistency from a data set contributes significantly to improving classification accuracy. Inconsistency occurs when objects have the same attribute values but belong to different classes. Inconsistency is either inherent in the data set or appears during data preprocessing steps such as discretization, dimensionality reduction, and missing value prediction. The aim of the paper is to develop a generalized inconsistency handling scheme based on probability distributions, unlike previous methods, which are context dependent. We propose two algorithms that remove inconsistency by assigning class labels to the objects afresh based on the statistical properties of the training data set. The ultimate goal of this research is to generate consistent data that provide superior classification accuracy compared to the original data set. The proposed methods are verified on the real-life, intrusion-domain NSL-KDD data set to establish our claim.
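The definition of inconsistency in the abstract (identical attribute values, different classes) can be illustrated with a minimal Python sketch. This is not either of the two algorithms proposed in the paper, whose details the abstract does not give. The function names, the toy records, and the majority-class relabeling rule are all assumptions chosen only to show the idea of reassigning labels from the class frequencies observed in the training data.

```python
from collections import Counter, defaultdict


def class_counts_by_attributes(rows, labels):
    """Count class labels for each distinct tuple of attribute values."""
    groups = defaultdict(Counter)
    for attrs, label in zip(rows, labels):
        groups[tuple(attrs)][label] += 1
    return groups


def find_inconsistent_groups(rows, labels):
    """Return attribute tuples that occur with more than one class label."""
    groups = class_counts_by_attributes(rows, labels)
    return {attrs: counts for attrs, counts in groups.items() if len(counts) > 1}


def relabel_by_majority(rows, labels):
    """Hypothetical stand-in rule: give every object the most frequent
    class among the objects that share its attribute values."""
    groups = class_counts_by_attributes(rows, labels)
    return [groups[tuple(attrs)].most_common(1)[0][0] for attrs in rows]


# Toy, made-up discretized records (not taken from NSL-KDD).
rows = [
    ("tcp", "low"),
    ("tcp", "low"),
    ("tcp", "low"),
    ("udp", "high"),
    ("udp", "high"),
]
labels = ["normal", "normal", "attack", "attack", "attack"]

inconsistent = find_inconsistent_groups(rows, labels)
consistent_labels = relabel_by_majority(rows, labels)
```

In this toy data only the ("tcp", "low") records are inconsistent, and after relabeling every attribute tuple maps to exactly one class. A probabilistic scheme of the kind the abstract describes would presumably use a richer model than a raw majority count.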
Pages: 69-74
Page count: 6