Causality-Based Attribute Weighting via Information Flow and Genetic Algorithm for Naive Bayes Classifier

Cited by: 19
Authors
Li, Ming [1]
Liu, Kefeng [1]
Affiliations
[1] Natl Univ Def Technol, Coll Meteorol & Oceanog, Nanjing 211101, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Genetic algorithms; Correlation; Weight measurement; Sociology; Classification algorithms; Bayes methods; Naive Bayes; attribute weighting; causality; information flow; genetic algorithm;
DOI
10.1109/ACCESS.2019.2947568
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The naive Bayes classifier (NBC) is an effective classification technique in data mining and machine learning that is based on the attribute conditional independence assumption. However, this assumption rarely holds in real-world applications, so numerous studies have sought to relax it through attribute weighting. To the best of our knowledge, almost all studies have calculated attribute weights according to a correlation measure or classification accuracy. In this paper, we propose a novel causality-based attribute weighting method to establish a weighted NBC called IFG-WNBC, in which causal information flow (IF) theory and a genetic algorithm (GA) are adopted to search for optimal weights. The introduction of IF yields a brand-new weighting criterion based on causality rather than correlation. The population initialization in the GA is also improved with IF-based weights for efficient optimization. Multiple sets of comparison experiments on UCI data sets demonstrate that IFG-WNBC outperforms the classic NBC and other common weighted NBC algorithms in classification accuracy and running time.
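The abstract does not state the IFG-WNBC formulas. For orientation only, the following is a minimal sketch of the generic attribute-weighted naive Bayes decision rule, in which each attribute's log-likelihood is scaled by a weight w_i before the class scores are compared. It assumes categorical attributes and Laplace smoothing; the function names `train` and `predict` are hypothetical, and the weights are taken as given. Deriving them from information flow and refining them with a genetic algorithm, as the paper proposes, is not shown.

```python
import math
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def predict(model, weights, row):
    """Return argmax over classes of log P(c) + sum_i w_i * log P(x_i | c)."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_c in class_counts.items():
        # Laplace-smoothed class prior (smoothing is an assumption of this sketch).
        score = math.log((n_c + 1) / (total + len(class_counts)))
        for i, value in enumerate(row):
            count = value_counts[(i, label)][value]
            likelihood = (count + 1) / (n_c + len(values_seen[i]))
            # The attribute weight scales the log-likelihood contribution.
            score += weights[i] * math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Setting every weight to 1 recovers the classic NBC, and a weight of 0 removes an attribute from the decision, so the weight vector is the quantity an attribute-weighting method has to choose.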
Pages: 150630-150641
Number of pages: 12