Attribute grouping-based naive Bayesian classifier

Cited by: 1
Authors
He, Yulin [1 ,2 ]
Ou, Guiliang [2 ]
Fournier-Viger, Philippe [2 ]
Huang, Joshua Zhexue [1 ,2 ]
Affiliations
[1] Guangdong Lab Artificial Intelligence & Digital Ec, Shenzhen 518107, Peoples R China
[2] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen 518060, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
naive Bayesian classifier; attribute independence assumption; attribute grouping; dependent attribute group; posterior probability; class-conditional probability; DENSITY-ESTIMATION; ALGORITHMS;
DOI
10.1007/s11432-022-3728-2
CLC number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The naive Bayesian classifier (NBC) is a supervised machine learning algorithm with a simple model structure and good theoretical interpretability. However, the generalization performance of NBC is limited to a large extent by the assumption of attribute independence. To address this issue, this paper proposes a novel attribute grouping-based NBC (AG-NBC), which is a variant of the classical NBC trained with different attribute groups. AG-NBC first applies a novel and effective objective function to automatically identify optimal dependent attribute groups (DAGs). Condition attributes in the same DAG are strongly dependent on the class attribute, whereas attributes in different DAGs are independent of one another. Then, for each DAG, a random vector functional link network with a SoftMax layer is trained to output posterior probabilities in the form of joint probability density estimation. The NBC is then trained on the grouped attributes in place of the original condition attributes. Extensive experiments were conducted to validate the rationality, feasibility, and effectiveness of AG-NBC. Our findings showed that the attribute groups chosen for NBC can accurately represent attribute dependencies and reduce overlaps between different posterior probability densities. In addition, comparative results with NBC, flexible NBC (FNBC), tree augmented Bayes network (TAN), gain ratio-based attribute weighted naive Bayes (GRAWNB), averaged one-dependence estimators (AODE), weighted AODE (WAODE), independent component analysis-based NBC (ICA-NBC), the hidden naive Bayesian (HNB) classifier, and correlation-based feature weighting filter for naive Bayes (CFW) show that AG-NBC obtains statistically better testing accuracies, higher areas under the receiver operating characteristic curve (AUCs), and lower probability mean square errors (PMSEs) than the other Bayesian classifiers.
The experimental results demonstrate that AG-NBC is a valid and efficient approach for alleviating the attribute independence assumption when building NBCs.
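The grouping idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: AG-NBC learns the attribute groups via its objective function and fits a random vector functional link network with a SoftMax layer per group, whereas here the groups are supplied by hand and a multivariate Gaussian stands in as the per-group joint density estimator. Only the structural point is shown: attributes inside a group are modeled jointly, while different groups are treated as conditionally independent given the class.

```python
import numpy as np

class GroupedNB:
    """Gaussian Bayes classifier over attribute groups: attributes within a
    group share one joint (multivariate) Gaussian per class, while different
    groups are treated as conditionally independent given the class.
    Illustrative stand-in for AG-NBC's per-DAG density estimators."""

    def __init__(self, groups):
        self.groups = [list(g) for g in groups]  # hand-supplied "DAGs"

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.priors_ = {c: float(np.mean(y == c)) for c in self.classes_}
        self.params_ = {}
        for c in self.classes_:
            Xc = X[y == c]
            for gi, g in enumerate(self.groups):
                Z = Xc[:, g]
                mu = Z.mean(axis=0)
                # Small ridge keeps the covariance matrix invertible.
                cov = np.atleast_2d(np.cov(Z, rowvar=False)) \
                      + 1e-6 * np.eye(len(g))
                self.params_[(c, gi)] = (mu, cov)
        return self

    def _log_gauss(self, z, mu, cov):
        d = len(mu)
        diff = z - mu
        _, logdet = np.linalg.slogdet(cov)
        return -0.5 * (d * np.log(2 * np.pi) + logdet
                       + diff @ np.linalg.inv(cov) @ diff)

    def predict(self, X):
        out = []
        for x in X:
            scores = {}
            for c in self.classes_:
                # Log-posterior up to a constant: log prior plus one joint
                # log-density term per attribute group.
                s = np.log(self.priors_[c])
                for gi, g in enumerate(self.groups):
                    mu, cov = self.params_[(c, gi)]
                    s += self._log_gauss(x[g], mu, cov)
                scores[c] = s
            out.append(max(scores, key=scores.get))
        return np.array(out)

# Demo: two classes with identical marginals but opposite correlation, so
# only a model that treats attributes 0 and 1 jointly can separate them.
rng = np.random.default_rng(0)
n = 300
X0 = rng.multivariate_normal([0, 0], [[1, 0.9], [0.9, 1]], n)
X1 = rng.multivariate_normal([0, 0], [[1, -0.9], [-0.9, 1]], n)
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)

acc_grouped = (GroupedNB([[0, 1]]).fit(X, y).predict(X) == y).mean()
acc_naive = (GroupedNB([[0], [1]]).fit(X, y).predict(X) == y).mean()
```

With the two attributes grouped, the classifier captures the opposite correlation structures and separates the classes well; with singleton groups (the classical independence assumption), the per-attribute marginals of the two classes coincide and accuracy collapses toward chance, which is exactly the limitation the paper targets.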
Pages: 25