An intelligent web-page classifier with fair feature-subset selection

Cited by: 7
Authors
Chen, Chih-Ming
Lee, Hahn-Ming
Tan, Chia-Chen
Affiliations
[1] Natl Chengchi Univ, Grad Inst Lib Informat & Archival Studies, Taipei 116, Taiwan
[2] Natl Taiwan Univ Sci & Technol, Dept Comp Sci & Informat Engn, Taipei 106, Taiwan
Keywords
feature selection; Web page classification; machine learning
DOI
10.1016/j.engappai.2006.02.001
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The explosion of on-line information has given rise to many manually constructed topic hierarchies (such as Yahoo!). But at the current growth rate in the amount of information, manual classification in topic hierarchies results in an immense information bottleneck. Therefore, developing an automatic classifier is an urgent need. However, classifiers suffer from enormous dimensionality, since the dimensionality is determined by the number of distinct keywords in a document corpus. More seriously, most classifiers either work slowly or are constructed subjectively without any learning ability. In this paper, we address these problems with a fair feature-subset selection (FFSS) algorithm and an adaptive fuzzy learning network (AFLN) for classification. The FFSS algorithm is used to reduce the enormous dimensionality. It not only gives fair treatment to each category but also has the ability to identify useful features, including both positive and negative features. The AFLN, in turn, provides extremely fast learning ability to model the uncertain behavior for classification so as to correct the fuzzy matrix automatically. Experimental results show that both the FFSS algorithm and the AFLN lead to a significant improvement in document classification, compared to alternative approaches. (c) 2006 Elsevier Ltd. All rights reserved.
Pages: 967-978
Page count: 12
Related papers
19 in total
  • [1] Classifier and feature set ensembles for web page classification
    Onan, Aytug
    JOURNAL OF INFORMATION SCIENCE, 2016, 42 (02) : 150 - 165
  • [2] A Comparative Study of Feature-Ranking and Feature-Subset Selection Techniques for Improved Fault Prediction
    Rathore, Santosh Singh
    Gupta, Atul
    PROCEEDINGS OF THE 7TH INDIA SOFTWARE ENGINEERING CONFERENCE 2014, ISEC '14, 2014
  • [3] A web page classification algorithm based on feature selection
    Zhou, Hongfang
    Guo, Jie
    Wang, Xinyi
    Duan, Wencong
    Wang, Peng
    Cao, Wenquan
    Journal of Information and Computational Science, 2015, 12 (04) : 1549 - 1556
  • [4] Support Vector Machine Ensembles Using Feature-Subset Selection for Enhancing Microarray Data Classification
    Ahmed, Eman
    El Gayar, Neamat
    El Azab, Iman A.
    INTERNATIONAL JOURNAL OF APPLIED MATHEMATICS & STATISTICS, 2012, 28 (04) : 1 - 11
  • [5] A Novel Feature Selection Framework for Automatic Web Page Classification
    J. Alamelu Mangai
    V. Santhosh Kumar
    S. Appavu alias Balamurugan
    International Journal of Automation and Computing, 2012, 9 (04) : 442 - 448
  • [6] Two novel feature selection approaches for web page classification
    Chen, Chih-Ming
    Lee, Hahn-Ming
    Chang, Yu-Jung
    EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (01) : 260 - 272
  • [7] A Novel Feature Selection Framework for Automatic Web Page Classification
    Mangai, J. Alamelu
    Kumar, V. Santhosh
    Balamurugan, S. Appavu Alias
    INTERNATIONAL JOURNAL OF AUTOMATION AND COMPUTING, 2012, 9 (04) : 442 - 448
  • [8] Fair Feature Subset Selection using Multiobjective Genetic Algorithm
    Rehman, Ayaz Ur
    Nadeem, Anas
    Malik, Muhammad Zubair
    PROCEEDINGS OF THE 2022 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE COMPANION, GECCO 2022, 2022 : 360 - 363
  • [9] A New Approach of Feature Selection for Chinese Web Page Categorization
    Li, Cunhe
    Zhu, Lina
    Liu, Kangwei
    ADVANCES IN COMPUTATION AND INTELLIGENCE, PROCEEDINGS, 2008, 5370 : 386 - 395
  • [10] AutoModeling: Integrated Approach for Automated Model Generation by Ensemble Selection of Feature Subset and Classifier
    Ukil, Arijit
    Sahu, Ishan
    Puri, Chetanya
    Mukherjee, Ayan
    Singh, Rituraj
    Bandyopadhyay, Soma
    Pal, Arpan
    2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018