Feature Selection based on Supervised Topic Modeling for Boosting-Based Multi-Label Text Categorization

Cited by: 0
Authors
Al-Salemi, Bassam [1]
Ayob, Masri [1]
Noah, Shahrul Azman Mohd [1]
Ab Aziz, Mohd Juzaiddin [1]
Affiliations
[1] Univ Kebangsaan Malaysia, Fac Informat Sci & Technol, Bangi, Malaysia
Source
PROCEEDINGS OF THE 2017 6TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATICS (ICEEI'17) | 2017
Keywords
AdaBoost.MH; feature selection; text categorization; supervised topic modeling; Latent Dirichlet Allocation; ALGORITHM;
DOI
Not available
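Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic and Communication Technology];
Discipline classification codes
0808; 0809
Abstract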
The Bag-of-Words text representation model is a simple, widely used model that treats single words as the elements representing texts in the feature space. However, using single words as features produces a high-dimensional feature space, which raises the computational cost of learning, particularly for ensemble learning algorithms such as the boosting algorithm AdaBoost.MH. A straightforward solution is to apply a feature selection method capable of reducing the feature space effectively. This work describes how to use the supervised topic model Labeled Latent Dirichlet Allocation for feature selection, and thereby accelerate AdaBoost.MH learning for multi-label text categorization. Experimental results on three benchmarks demonstrate that using Labeled Latent Dirichlet Allocation for feature selection improves and accelerates AdaBoost.MH and exceeds the performance of three existing methods.
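The idea of topic-model-based feature selection can be illustrated with a minimal sketch. It assumes that a supervised topic model such as Labeled LDA has already produced one word distribution per label. The selection rule shown here (keep the union of each label's top-k words) is an illustrative assumption and not necessarily the rule used in the paper; the names `select_features`, `project`, and `phi`, and all probabilities, are hypothetical.

```python
from typing import Dict, List


def select_features(label_word_probs: Dict[str, Dict[str, float]], top_k: int) -> List[str]:
    """Return the union of each label's top_k most probable words (assumed rule)."""
    selected = set()
    for word_probs in label_word_probs.values():
        # Rank by descending probability; break ties alphabetically for determinism.
        ranked = sorted(word_probs, key=lambda w: (-word_probs[w], w))
        selected.update(ranked[:top_k])
    return sorted(selected)


def project(doc_tokens: List[str], vocabulary: List[str]) -> List[str]:
    """Keep only the tokens that survive feature selection."""
    keep = set(vocabulary)
    return [t for t in doc_tokens if t in keep]


# Toy, made-up label-word distributions standing in for Labeled LDA output.
phi = {
    "sports": {"match": 0.5, "team": 0.3, "the": 0.15, "market": 0.05},
    "finance": {"market": 0.6, "stock": 0.25, "the": 0.1, "team": 0.05},
}

features = select_features(phi, top_k=2)
reduced = project(["the", "team", "won", "the", "match"], features)
```

In this toy setting the common word "the" is dropped because it is never among the top words of any label, so a boosting learner such as AdaBoost.MH would search over a smaller set of candidate features per round.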
Pages: 6
Related papers
50 in total
  • [21] Compact feature subset-based multi-label music categorization for mobile devices
    Lee, Jaesung
    Seo, Wangduk
    Park, Jin-Hyeong
    Kim, Dae-Won
    MULTIMEDIA TOOLS AND APPLICATIONS, 2019, 78 (04) : 4869 - 4883
  • [22] Feature Selection Method Based on Crossed Centroid for Text Categorization
    Yang, Jieming
    Liu, Zhiying
    Qu, Zhaoyang
    Wang, Jing
    2014 15TH IEEE/ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING AND PARALLEL/DISTRIBUTED COMPUTING (SNPD), 2014, : 11 - 15
  • [23] Mutual information-based label distribution feature selection for multi-label learning
    Qian, Wenbin
    Huang, Jintao
    Wang, Yinglong
    Shu, Wenhao
    KNOWLEDGE-BASED SYSTEMS, 2020, 195
  • [24] Label correlations-based multi-label feature selection with label enhancement
    Qian, Wenbin
    Xiong, Yinsong
    Ding, Weiping
    Huang, Jintao
    Vong, Chi-Man
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2024, 127
  • [25] Multi-label feature selection based on rough granular-ball and label distribution
    Qian, Wenbin
    Xu, Fankang
    Qian, Jin
    Shu, Wenhao
    Ding, Weiping
    INFORMATION SCIENCES, 2023, 650
  • [26] Toward embedding-based multi-label feature selection with label and feature collaboration
    Liang Dai
    Jia Zhang
    Guodong Du
    Candong Li
    Rong Wei
    Shaozi Li
    Neural Computing and Applications, 2023, 35 : 4643 - 4665
  • [27] Feature selection based on feature interactions with application to text categorization
    Tang, Xiaochuan
    Dai, Yuanshun
    Xiang, Yanping
    EXPERT SYSTEMS WITH APPLICATIONS, 2019, 120 : 207 - 216
  • [28] Multi-label feature selection method based on dynamic weight
    Zhang, Ping
    Sheng, Jiyao
    Gao, Wanfu
    Hu, Juncheng
    Li, Yonghao
    SOFT COMPUTING, 2022, 26 (06) : 2793 - 2805
  • [29] Multi-label feature selection based on neighborhood mutual information
    Lin, Yaojin
    Hu, Qinghua
    Liu, Jinghua
    Chen, Jinkun
    Duan, Jie
    APPLIED SOFT COMPUTING, 2016, 38 : 244 - 256
  • [30] Granular multi-label feature selection based on mutual information
    Li, Feng
    Miao, Duoqian
    Pedrycz, Witold
    PATTERN RECOGNITION, 2017, 67 : 410 - 423