A Combination based on OWA Operators for Multi-label Genre Classification of web pages

Cited by: 0
Author
Jebari, Chaker [1]
Affiliation
[1] Coll Appl Sci, POB 14, PC 516, Muscat, Oman
Source
PROCESAMIENTO DEL LENGUAJE NATURAL | 2015 / Issue 54
Keywords
OWA; combination; multi-label; classifier; genre; web page;
DOI
Not available
Chinese Library Classification number
H0 [Linguistics];
Discipline classification codes
030303 ; 0501 ; 050102 ;
Abstract
This paper presents a new method for genre identification that combines homogeneous classifiers using OWA (Ordered Weighted Averaging) operators. Our method uses character n-grams extracted from different information sources of a web page, such as the URL, title, headings and anchors. To deal with the complexity of web pages, we applied MLKNN as a multi-label classifier, so that a web page can be assigned to more than one genre. Experiments conducted on a known multi-label corpus show that our method achieves good results.
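The combination step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the OWA weights, the decision threshold, the genre labels and the per-source scores are all hypothetical, and the scores simply stand in for the outputs of one multi-label classifier (such as MLKNN) per information source. An OWA operator sorts its inputs in descending order and then takes a weighted sum, so each weight is attached to a rank rather than to a particular source.

```python
from typing import Dict, List, Sequence


def char_ngrams(text: str, n: int = 3) -> List[str]:
    """Return all overlapping character n-grams of `text`."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered Weighted Averaging: weights apply to ranks, not to sources."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("OWA weights must sum to 1")
    ordered = sorted(values, reverse=True)  # largest score gets the first weight
    return sum(w * v for w, v in zip(weights, ordered))


def combine(
    scores_by_source: Dict[str, Dict[str, float]],
    weights: Sequence[float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the paper
) -> List[str]:
    """Aggregate per-source genre scores with OWA and keep every genre
    whose aggregated score reaches the threshold (multi-label output)."""
    genres = sorted({g for scores in scores_by_source.values() for g in scores})
    selected = []
    for genre in genres:
        values = [scores.get(genre, 0.0) for scores in scores_by_source.values()]
        if owa(values, weights) >= threshold:
            selected.append(genre)
    return selected


# Hypothetical scores from four per-source classifiers (illustrative only).
example_scores = {
    "url": {"blog": 0.8, "shop": 0.1},
    "title": {"blog": 0.6, "shop": 0.7},
    "headings": {"blog": 0.4, "shop": 0.2},
    "anchors": {"blog": 0.7, "shop": 0.6},
}
example_weights = [0.4, 0.3, 0.2, 0.1]  # assumed weights, emphasising top ranks

if __name__ == "__main__":
    print(char_ngrams("blog", 3))
    print(combine(example_scores, example_weights))
```

Because the weights follow rank order, putting all the weight on the first position reduces OWA to the maximum, putting it all on the last position gives the minimum, and uniform weights give the arithmetic mean. Lowering or raising the threshold changes how many genres a page receives.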
Pages: 13-20
Page count: 8