NAIVE BAYESIAN AND K-NEAREST NEIGHBOUR TO CATEGORIZE ARABIC TEXT DATA

Times cited: 0
Authors
Hadi, Wa'el Musa
Thabtah, Fadi
Hawari, Samer A. L.
Ababneh, Jafar
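Affiliations
Source
EUROPEAN SIMULATION AND MODELLING CONFERENCE 2008 | 2008
Keywords
Text Categorization; Naive Bayesian; Arabic Text Data; K-Nearest Neighbour
DOI
Not available
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract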
Text classification is a supervised learning technique that uses labelled training data to derive a classification system (classifier) and then uses that classifier to classify unlabelled text data automatically. This paper investigates the Naive Bayesian (NB) method and the K-Nearest Neighbour (KNN) algorithm on different Arabic data sets. The comparison is based on the most popular text evaluation measures. Experimental results on different Arabic text categorisation data sets reveal that the NB algorithm outperforms KNN based on the Cosine Coefficient approach with regard to all measures.
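
The two classifier families named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the whitespace tokenizer, the Laplace smoothing, the raw term-count vectors, the value k=3, and the four toy training documents in TRAIN are all assumptions made here for illustration, and the paper's preprocessing and data sets are not reproduced.

```python
import math
from collections import Counter, defaultdict

# Toy labelled documents (hypothetical, not from the paper's data sets).
TRAIN = [
    ("كرة القدم مباراة فريق", "sport"),
    ("فريق لاعب هدف مباراة", "sport"),
    ("سوق الأسهم بنك اقتصاد", "economy"),
    ("بنك استثمار سوق مال", "economy"),
]

def tokenize(text):
    # Assumption: plain whitespace splitting, no stemming or stop-word removal.
    return text.split()

def train_nb(docs):
    """Collect per-class document counts, per-class term counts, and the vocabulary."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_docs[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, word_counts, vocab

def predict_nb(model, text):
    """Multinomial Naive Bayes with Laplace (add-one) smoothing, in log space."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # terms never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cosine(a, b):
    """Cosine coefficient between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def predict_knn(docs, text, k=3):
    """Majority vote among the k training documents most similar to the query."""
    query = Counter(tokenize(text))
    scored = sorted(
        ((cosine(query, Counter(tokenize(t))), label) for t, label in docs),
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]
```

A call such as predict_nb(train_nb(TRAIN), "بنك سوق") or predict_knn(TRAIN, "بنك سوق") returns a label from the toy training set. Comparing the two on held-out data with the usual evaluation measures would follow the kind of comparison the abstract describes.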
Pages: 196-200
Page count: 5