Feature Selection based on Supervised Topic Modeling for Boosting-Based Multi-Label Text Categorization

Cited by: 0
Authors
Al-Salemi, Bassam [1]
Ayob, Masri [1]
Noah, Shahrul Azman Mohd [1]
Ab Aziz, Mohd Juzaiddin [1]
Affiliations
[1] Univ Kebangsaan Malaysia, Fac Informat Sci & Technol, Bangi, Malaysia
Source
PROCEEDINGS OF THE 2017 6TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND INFORMATICS (ICEEI'17) | 2017
Keywords
AdaBoost.MH; feature selection; text categorization; supervised topic modeling; Latent Dirichlet Allocation; ALGORITHM;
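DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract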
The Bag-of-Words text representation model is a simple and typical model that uses single words as the elements representing texts in the feature space. However, using single words as features produces a high-dimensional feature space, which increases the computational cost of learning, particularly for ensemble learning algorithms such as the boosting algorithm AdaBoost.MH. A straightforward solution is to use a feature selection method capable of reducing the feature space effectively. This work describes how to use the supervised topic model Labeled Latent Dirichlet Allocation for feature selection, as well as for accelerating AdaBoost.MH learning in multi-label text categorization. Experimental results on three benchmarks demonstrate that using Labeled Latent Dirichlet Allocation for feature selection improves and accelerates AdaBoost.MH and exceeds the performance of three existing methods.
Pages: 6