An Ensemble Framework for Text Classification

Cited by: 0
Authors
Kamateri, Eleni [1]
Salampasis, Michail [1]
Affiliations
[1] International Hellenic University, Department of Information and Electronic Engineering, Alexander Campus, P.O. Box 141, Thessaloniki 57400, Greece
Keywords
ensemble learning; ensemble framework; text classification; patent classification; neural networks
DOI
10.3390/info16020085
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Ensemble learning can improve predictive performance compared to the performance of any of its constituents alone, while keeping computational demands manageable. However, no reference methodology is available for developing ensemble systems. In this paper, we adapt an ensemble framework for patent classification to assist data scientists in creating flexible ensemble architectures for text classification by selecting a finite set of constituent base models from the many available alternatives. We analyze the axes along which someone can select base models of an ensemble system and propose a methodology for combining them. Moreover, we conduct experiments to compare the effectiveness of ensemble systems against base models and state-of-the-art methods on multiple datasets (three patent classification and two text classification datasets), including long and short texts and single- and/or multi-labeled texts. The results verify the generality of our framework and the effectiveness of ensemble systems, especially ensembles of classifiers trained on different data sections/metadata.
Pages: 18
Related papers
50 in total
  • [31] A Classifier Ensemble Framework for Multimedia Big Data Classification
    Yan, Yilin
    Zhu, Qiusha
    Shyu, Mei-Ling
    Chen, Shu-Ching
    PROCEEDINGS OF 2016 IEEE 17TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IEEE IRI), 2016, : 615 - 622
  • [32] Topic selection for text classification using ensemble topic modeling with grouping, scoring, and modeling approach
    Voskergian, Daniel
    Jayousi, Rashid
    Yousef, Malik
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [33] Distributed Text Classification With an Ensemble Kernel-Based Learning Approach
    Silva, Catarina
    Lotric, Uros
    Ribeiro, Bernardete
    Dobnikar, Andrej
    IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2010, 40 (03): : 287 - 297
  • [34] Ensemble of Binary Classification for the Emotion Detection in Code-Switching Text
    Zhang, Xinghua
    Zhang, Chunyue
    Shi, Huaxing
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2018, PT II, 2018, 11109 : 178 - 189
  • [35] Robustness and Predictive Performance of Homogeneous Ensemble Feature Selection in Text Classification
    Mehta, Poornima
    Chandra, Satish
    INTERNATIONAL JOURNAL OF INFORMATION RETRIEVAL RESEARCH, 2021, 11 (01) : 75 - 89
  • [36] Text Classification Based on Multilingual Back-Translation and Model Ensemble
    Song, Jinwang
    Zan, Hongying
    Liu, Tao
    Zhang, Kunli
    Ji, Xinmeng
    Cui, Tingting
    HEALTH INFORMATION PROCESSING: EVALUATION TRACK PAPERS, CHIP 2023, 2024, 2080 : 231 - 241
  • [37] A Framework for Explainable Text Classification in Legal Document Review
    Mahoney, Christian J.
    Zhang, Jianping
    Huber-Fliflet, Nathaniel
    Gronvall, Peter
    Zhao, Haozhen
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 1858 - 1867
  • [38] An Automated Text Document Classification Framework using BERT
    Shah, Momna Ali
    Iqbal, Muhammad Javed
    Noreen, Neelum
    Ahmed, Iftikhar
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (03) : 279 - 285
  • [39] An Ensemble Learning Strategy for Eligibility Criteria Text Classification for Clinical Trial Recruitment: Algorithm Development and Validation
    Zeng, Kun
    Pan, Zhiwei
    Xu, Yibin
    Qu, Yingying
    JMIR MEDICAL INFORMATICS, 2020, 8 (07)
  • [40] ChromEDA: Chromosome classification by ensemble framework based domain adaptation
    Zhang, Lin
    Fan, Xinyu
    Lin, Kunjie
    Qi, Ruikun
    Yi, Xianpeng
    Liu, Hui
    MICROSCOPY RESEARCH AND TECHNIQUE, 2024, 87 (04) : 832 - 843