An Ensemble of Naive Bayes Classifiers for Uncertain Categorical Data

Cited by: 5
Authors
de Holanda Maia, Marcelo Rodrigues [1, 2]
Plastino, Alexandre [1]
Freitas, Alex A. [3]
Affiliations
[1] Univ Fed Fluminense, Inst Comp, Niteroi, RJ, Brazil
[2] Inst Brasileiro Geog & Estat, Rio De Janeiro, RJ, Brazil
[3] Univ Kent, Sch Comp, Canterbury, Kent, England
Source
2021 21st IEEE International Conference on Data Mining (ICDM 2021) | 2021
Keywords
Classification; Ensemble; Uncertain data; Naive Bayes; Bioinformatics
DOI
10.1109/ICDM51629.2021.00148
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Coping with uncertainty is a very challenging issue in many real-world applications. However, conventional classification models usually assume there is no uncertainty in data at all. In order to fill this gap, there has been a growing number of studies addressing the problem of classification based on uncertain data. Although some methods resort to ignoring uncertainty or artificially removing it from data, it has been shown that predictive performance can be improved by actually incorporating information on uncertainty into classification models. This paper proposes an approach for building an ensemble of classifiers for uncertain categorical data based on biased random subspaces. Using Naive Bayes classifiers as base models, we have applied this approach to classify ageing-related genes based on real data, with uncertain features representing protein-protein interactions. Our experimental results show that models based on the proposed approach achieve better predictive performance than single Naive Bayes classifiers and conventional ensembles.
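The abstract names the main ingredients (Naive Bayes base models, uncertain categorical features, biased random subspaces) without giving details, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes each uncertain feature value is given as a probability distribution over its categories, that the Naive Bayes model is trained on expected counts, and that the subspace bias is a vector of positive per-feature sampling weights. The function names, the `weights` input, and the toy data are all hypothetical.

```python
import math
import random

def train_nb(X, y, features, n_values, alpha=1.0):
    """Train Naive Bayes on expected counts (an assumed treatment of uncertainty).

    X[i][j] is a list of probabilities over the n_values categories of feature j.
    """
    classes = sorted(set(y))
    prior = {c: (y.count(c) + alpha) / (len(y) + alpha * len(classes)) for c in classes}
    # Start every count at alpha (Laplace smoothing), then add fractional counts.
    cond = {c: {j: [alpha] * n_values for j in features} for c in classes}
    for xi, ci in zip(X, y):
        for j in features:
            for v, p in enumerate(xi[j]):
                cond[ci][j][v] += p
    # Normalise expected counts into P(value | class).
    for c in classes:
        for j in features:
            total = sum(cond[c][j])
            cond[c][j] = [n / total for n in cond[c][j]]
    return {"prior": prior, "cond": cond, "features": features}

def predict_proba_nb(model, x):
    """Class posteriors, marginalising over each uncertain feature value."""
    log_post = {}
    for c, pc in model["prior"].items():
        lp = math.log(pc)
        for j in model["features"]:
            lp += math.log(sum(p * q for p, q in zip(x[j], model["cond"][c][j])))
        log_post[c] = lp
    m = max(log_post.values())
    unnorm = {c: math.exp(lp - m) for c, lp in log_post.items()}
    z = sum(unnorm.values())
    return {c: u / z for c, u in unnorm.items()}

def biased_subspace(weights, k, rng):
    """Sample k distinct features with probability proportional to positive weights."""
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        w = [weights[j] for j in remaining]
        pick = rng.choices(remaining, weights=w, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return sorted(chosen)

def train_ensemble(X, y, weights, n_models, k, n_values=2, seed=0):
    """Train one Naive Bayes model per biased random subspace."""
    rng = random.Random(seed)
    return [train_nb(X, y, biased_subspace(weights, k, rng), n_values)
            for _ in range(n_models)]

def predict_proba_ensemble(models, x):
    """Average the class posteriors of all base models."""
    votes = [predict_proba_nb(m, x) for m in models]
    return {c: sum(v[c] for v in votes) / len(votes) for c in votes[0]}

# Toy data (hypothetical): three binary features, each stored as [P(0), P(1)].
X = [
    [[0.1, 0.9], [0.2, 0.8], [0.5, 0.5]],
    [[0.2, 0.8], [0.1, 0.9], [0.4, 0.6]],
    [[0.9, 0.1], [0.8, 0.2], [0.5, 0.5]],
    [[0.8, 0.2], [0.9, 0.1], [0.6, 0.4]],
]
y = [1, 1, 0, 0]
weights = [1.0, 1.0, 0.2]  # assumed bias: the third feature is sampled less often
models = train_ensemble(X, y, weights, n_models=5, k=2)
p = predict_proba_ensemble(models, [[0.05, 0.95], [0.1, 0.9], [0.5, 0.5]])
```

How the weights are derived is left open here; the paper itself should be consulted for the actual definition of the bias and of the uncertain-data Naive Bayes variant.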
Pages: 1222-1227
Page count: 6