Web Page Classification with Social Annotations

Cited by: 0
Authors
Zubiaga, Arkaitz [1]
Martinez, Raquel [1]
Fresno, Victor [1]
Affiliation
[1] Univ Nacl Educac Distan, C-Juan Rosal,16, Madrid 20840, Spain
Source
PROCESAMIENTO DEL LENGUAJE NATURAL | 2009 / Issue 43
Keywords
web page classification; social annotations; social bookmarking
DOI
Not available
Chinese Library Classification number
H0 [Linguistics]
Subject classification codes
030303; 0501; 050102
Abstract
User-generated annotations on social bookmarking sites can provide promising metadata for web page classification. These annotations include diverse types of information, such as tags and comments, and each type differs in nature and in popularity. In this work, we analyze and evaluate the usefulness of each type of social annotation for classifying web pages over a taxonomy such as that of the Open Directory Project. We compare each type separately against content-based classification, and we also combine the different types of data. Our experiments show encouraging results for the use of social annotations in this task, and combining these metadata with web page content improves the classifier's performance further.
Pages: 225-233
Page count: 9