Feature Selection based on Supervised Topic Modeling for Boosting-Based Multi-Label Text Categorization

Authors
Al-Salemi, Bassam [1 ]
Ayob, Masri [1 ]
Noah, Shahrul Azman Mohd [1 ]
Ab Aziz, Mohd Juzaiddin [1 ]
Affiliation
[1] Univ Kebangsaan Malaysia, Fac Informat Sci & Technol, Bangi, Malaysia
Source
PROCEEDINGS OF THE 2017 6TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATICS (ICEEI'17) | 2017
Keywords
AdaBoost.MH; feature selection; text categorization; supervised topic modeling; Latent Dirichlet Allocation; ALGORITHM;
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Classification Code
0808; 0809;
Abstract
The Bag-of-Words text representation model is a simple, widely used model that represents texts in the feature space using single words as elements. However, using single words as features produces a high-dimensional feature space, which increases the computational cost of learning, particularly for ensemble learning algorithms such as the boosting algorithm AdaBoost.MH. A straightforward solution is to apply a feature selection method capable of reducing the feature space effectively. This work describes how to use the supervised topic model Labeled Latent Dirichlet Allocation for feature selection and for accelerating AdaBoost.MH learning in multi-label text categorization. Experimental results on three benchmarks demonstrate that using Labeled Latent Dirichlet Allocation for feature selection improves and accelerates AdaBoost.MH and outperforms three existing methods.
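The general idea of shrinking a Bag-of-Words feature space with per-label word weights can be illustrated with a minimal sketch. The abstract does not give the paper's selection criterion, so everything below is an assumption: the function names `select_features` and `project`, the `top_k` parameter, the toy data, and the rule of keeping the union of the highest-weight words per label are all hypothetical. Relative word frequency per label is used only as a simple stand-in for the label-word distributions a trained Labeled Latent Dirichlet Allocation model would supply; it is not Labeled LDA itself.

```python
from collections import Counter, defaultdict


def select_features(docs, labels, top_k):
    """Return the union of the top_k highest-weight words for each label.

    docs:   list of token lists (Bag-of-Words input)
    labels: list of label sets, one per document (multi-label)
    """
    # ASSUMPTION: relative word frequency per label stands in for the
    # label-word distributions of a trained Labeled LDA model.
    counts = defaultdict(Counter)
    for tokens, doc_labels in zip(docs, labels):
        for label in doc_labels:
            counts[label].update(tokens)

    selected = set()
    for word_counts in counts.values():
        total = sum(word_counts.values())
        weights = {w: c / total for w, c in word_counts.items()}
        # Rank by weight (descending), then alphabetically for determinism.
        ranked = sorted(weights, key=lambda w: (-weights[w], w))
        selected.update(ranked[:top_k])
    return selected


def project(tokens, selected):
    """Keep only the selected words of one document."""
    return [t for t in tokens if t in selected]


# Toy multi-label corpus (hypothetical data).
docs = [
    ["goal", "match", "team", "goal"],
    ["vote", "party", "vote", "team"],
    ["match", "vote", "goal", "party"],
]
labels = [{"sport"}, {"politics"}, {"sport", "politics"}]

features = select_features(docs, labels, top_k=2)
reduced = [project(d, features) for d in docs]
```

In this toy corpus the word "team" carries little weight for either label and is dropped, so a downstream learner such as AdaBoost.MH would see four features instead of five. The saving on a real vocabulary depends on `top_k` and on the quality of the weights.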
Pages: 6