An Ensemble Framework for Text Classification

Cited by: 0
Authors
Kamateri, Eleni [1]
Salampasis, Michail [1]
Affiliations
[1] Int Hellen Univ, Dept Informat & Elect Engn, Alexander Campus, POB 141, Thessaloniki 57400, Greece
Keywords
ensemble learning; ensemble framework; text classification; patent classification; neural networks
DOI
10.3390/info16020085
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Ensemble learning can improve predictive performance over that of any single constituent model while keeping computational demands manageable. However, no reference methodology is available for developing ensemble systems. In this paper, we adapt an ensemble framework for patent classification to assist data scientists in creating flexible ensemble architectures for text classification by selecting a finite set of constituent base models from the many available alternatives. We analyze the axes along which the base models of an ensemble system can be selected and propose a methodology for combining them. Moreover, we conduct experiments to compare the effectiveness of ensemble systems against base models and state-of-the-art methods on multiple datasets (three patent classification and two text classification datasets), including long and short texts and single- and/or multi-labeled texts. The results verify the generality of our framework and the effectiveness of ensemble systems, especially ensembles of classifiers trained on different data sections/metadata.
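The abstract does not say how the base models are combined, so the following is only a minimal sketch of one common option, weighted soft voting, applied to classifiers that were each trained on a different section of a document. The section names (title, abstract, claims), the class labels, the probability values, and the helper names `soft_vote` and `top_k` are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import Dict, List, Optional

# Each base model is assumed to output a mapping: class label -> probability.
Probabilities = Dict[str, float]


def soft_vote(predictions: List[Probabilities],
              weights: Optional[List[float]] = None) -> Probabilities:
    """Combine base-model outputs by a weighted average of class probabilities."""
    if not predictions:
        raise ValueError("at least one base-model prediction is required")
    if weights is None:
        weights = [1.0] * len(predictions)  # equal weights by default
    if len(weights) != len(predictions):
        raise ValueError("one weight per base model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # A label missing from one model's output counts as probability 0.0.
    labels = sorted({label for p in predictions for label in p})
    return {
        label: sum(w * p.get(label, 0.0) for w, p in zip(weights, predictions)) / total
        for label in labels
    }


def top_k(combined: Probabilities, k: int = 1) -> List[str]:
    """Return the k highest-scoring labels (ties broken alphabetically)."""
    ranked = sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))
    return [label for label, _ in ranked[:k]]


# Hypothetical outputs of three base models, one per document section.
title_model = {"A": 0.6, "B": 0.3, "C": 0.1}
abstract_model = {"A": 0.2, "B": 0.7, "C": 0.1}
claims_model = {"A": 0.3, "B": 0.5, "C": 0.2}

combined = soft_vote([title_model, abstract_model, claims_model])
single_label = top_k(combined, k=1)  # single-label prediction
multi_label = top_k(combined, k=2)   # multi-label prediction, k chosen by the user
```

Returning the top k labels rather than only the best one is one way to cover both the single-label and multi-label settings mentioned in the abstract; the weights could, for example, be tuned on a validation set so that more reliable sections count more.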
Pages: 18