A fuzzy classification based on feature selection for web pages

Cited by: 2
Authors
Zhang, MY [1]
Lu, ZD [1]
Affiliation
[1] Huazhong Univ Sci & Technol, Dept Comp Sci & Technol, Wuhan 430074, Peoples R China
Source
IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE (WI 2004), PROCEEDINGS | 2004
DOI
10.1109/WI.2004.10063
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Automatic web page classification is needed for web information extraction, but the number of keywords in web pages is so large that many classifiers are neither fast nor capable of self-learning. This paper proposes a fuzzy classification method for web pages based on fuzzy learning and parallel feature selection. Fuzzy learning of the parameter c(ik) is adopted to increase accuracy, while parallel feature selection based on weighted similarity is used both to reduce the dimension of the features and to remove the need to learn the parameter sigma(ik). The feature weights are derived theoretically, and a parallel matrix summation algorithm is proposed to speed up their calculation.
Pages: 469-472
Page count: 4