AdaBoost ranking results improvement by pairwise classifiers for web page classification

Cited by: 0
Authors
Gąciarz T. [1]
Czajkowski K. [1]
Niebylski M. [1]
Affiliation
[1] Institute of Telecomputing, Faculty of Physics, Mathematics and Computer Science, Cracow University of Technology
Source
Advances in Intelligent and Soft Computing | 2011 / Vol. 103
Keywords
AdaBoost; classification; features extraction; Web page;
DOI
10.1007/978-3-642-23169-8_43
Abstract
The article concerns the analysis of information describing web pages. The aim of the analysis is to support their classification. Pages belonging to a specific class are characterized by a similar 'style' in terms of the form or type of content presentation. Various characteristics are taken into account, including, inter alia, structural, visual, text, web, and link features. During the construction of classifiers, the AdaBoost algorithm was applied to create a ranking list of classifiers. Pairwise classifiers were then used to improve the final classification. The paper presents the implementation of this solution and the results of experiments. © 2011 Springer-Verlag Berlin Heidelberg.
Pages: 393-400
Page count: 7
Related papers
12 items in total
[1]  
Czajkowski K., Decision rules and databases in web pages classification, Studia Informatica, 30, 2 A83, pp. 355-372, (2009)
[2]  
Dong L., Watters C., Duffy J., Shepherd M., An examination of genre attributes for web page classification, Proceedings of the 41st Annual Hawaii International Conference on System Sciences (HICSS), pp. 133-133, (2008)
[3]  
Meyer Zu Eissen S., Stein B., Genre classification of web pages, KI 2004, 3238, pp. 256-269, (2004)
[4]  
Freund Y., Schapire R.E., A decision-theoretic generalization of on-line learning and an application to boosting, Proceedings of the 2nd European Conference on Computational Learning Theory, pp. 23-37, (1995)
[5]  
Holden N., Freitas A.A., Web page classification with an ant colony algorithm, PPSN 2004, 3242, pp. 1092-1102, (2004)
[6]  
Milkowski M.
[7]  
Santini M., Some issues in automatic genre classification of web pages, Proceedings of JADT, (2008)
[8]  
Shepherd M., Watters C., Identifying web genre: Hitting a moving target, Proceedings of the WWW2004 Conference Workshop on Measuring Web Search Effectiveness: The User Perspective, (2004)
[9]  
Tsukada M., Washio T., Motoda H., Automatic web-page classification by using machine learning methods, WI 2001, 2198, pp. 303-313, (2001)
[10]  
Xhemali D., Hinde C., Stone R., Naive Bayes vs. decision trees vs. neural networks in the classification of training web pages, International Journal of Computer Science Issues, 4, 1, pp. 16-23, (2009)