Combining naive Bayes and n-gram language models for text classification

Cited by: 0
Authors
Peng, FC [1]
Schuurmans, D [1]
Affiliation
[1] Univ Waterloo, Sch Comp Sci, Waterloo, ON N2L 3G1, Canada
Source
ADVANCES IN INFORMATION RETRIEVAL | 2003 / Vol. 2633
Keywords
DOI
Not available
Chinese Library Classification
TP [automation technology, computer technology];
Discipline Code
0812 ;
Abstract
We augment the naive Bayes model with an n-gram language model to address two shortcomings of naive Bayes text classifiers. The chain augmented naive Bayes classifiers we propose have two advantages over standard naive Bayes classifiers. First, a chain augmented naive Bayes model relaxes some of the independence assumptions of naive Bayes, allowing a local Markov chain dependence in the observed variables, while still permitting efficient inference and learning. Second, smoothing techniques from statistical language modeling can be used to recover better estimates than the Laplace smoothing techniques usually used in naive Bayes classification. Our experimental results on three real-world data sets show that we achieve substantial improvements over standard naive Bayes classification, while also achieving state-of-the-art performance that competes with the best known methods in these cases.
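The idea in the abstract can be sketched in a few lines of code: each class gets its own n-gram language model over the observed symbols, and a document is assigned to the class whose model (times the class prior) gives it the highest likelihood. Below is a minimal illustrative sketch, assuming a character-level bigram model per class with add-delta smoothing; the class name `ChainAugmentedNB` and the smoothing choice are assumptions for illustration, not the paper's actual implementation, which evaluates higher n-gram orders and several language-model smoothing methods.

```python
import math
from collections import defaultdict

class ChainAugmentedNB:
    """Illustrative chain augmented naive Bayes: one character bigram
    model per class, with add-delta smoothing (a hypothetical setup)."""

    def __init__(self, delta=0.5):
        self.delta = delta      # smoothing parameter
        self.bigram = {}        # class -> prev char -> char -> count
        self.context = {}       # class -> prev char -> total count
        self.priors = {}        # class -> log prior
        self.vocab = set()

    def fit(self, docs, labels):
        counts = defaultdict(int)
        for text, y in zip(docs, labels):
            counts[y] += 1
            bg = self.bigram.setdefault(y, defaultdict(lambda: defaultdict(int)))
            ctx = self.context.setdefault(y, defaultdict(int))
            padded = "^" + text  # start-of-document marker
            for prev, cur in zip(padded, padded[1:]):
                bg[prev][cur] += 1
                ctx[prev] += 1
                self.vocab.add(cur)
        total = sum(counts.values())
        self.priors = {y: math.log(n / total) for y, n in counts.items()}

    def log_prob(self, text, y):
        # log P(y) + sum_i log P(c_i | c_{i-1}, y): the local Markov
        # chain dependence that relaxes the naive Bayes assumption
        V = len(self.vocab)
        score = self.priors[y]
        padded = "^" + text
        for prev, cur in zip(padded, padded[1:]):
            num = self.bigram[y][prev][cur] + self.delta
            den = self.context[y][prev] + self.delta * V
            score += math.log(num / den)
        return score

    def predict(self, text):
        return max(self.priors, key=lambda y: self.log_prob(text, y))
```

With a standard naive Bayes model, each character would be scored independently given the class; here each character is conditioned on its predecessor, and the Laplace-style add-delta term stands in for the richer smoothing methods the paper borrows from statistical language modeling.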
Pages: 335 / 350
Page count: 16
References
27 items in total
[1]  
[Anonymous], THESIS U TWENTE
[2]  
Bell T. C., 1990, TEXT COMPRESSION
[3]  
CAVNAR W, 1994, P SDAIR 94
[4]  
CHEN S.F., 1998, EMPIRICAL STUDY SMOO
[5]   On the optimality of the simple Bayesian classifier under zero-one loss [J].
Domingos, P ;
Pazzani, M .
MACHINE LEARNING, 1997, 29 (2-3) :103-130
[6]  
EYHERAMENDY S, 2003, IN PRESS ARTIFICIAL I
[7]   Bayesian network classifiers [J].
Friedman, N ;
Geiger, D ;
Goldszmidt, M .
MACHINE LEARNING, 1997, 29 (2-3) :131-163
[8]  
Hart P.E., 1973, Pattern recognition and scene analysis
[9]  
HE J, 2000, P PRICAI 2000 INT WO, P24
[10]  
KEOGH E, 1999, ARTIFICIAL INTELLIGE