Text Classification Algorithms: A Survey

Cited by: 738
Authors
Kowsari, Kamran [1, 2]
Meimandi, Kiana Jafari [1]
Heidarysafa, Mojtaba [1]
Mendu, Sanjana [1]
Barnes, Laura [1, 2, 3]
Brown, Donald [1, 3]
Affiliations
[1] Univ Virginia, Dept Syst & Informat Engn, Charlottesville, VA 22904 USA
[2] Univ Virginia, Sensing Syst Hlth Lab, Charlottesville, VA 22911 USA
[3] Univ Virginia, Sch Data Sci, Charlottesville, VA 22904 USA
Keywords
text classification; text mining; text representation; text categorization; text analysis; document classification; ROC curve; dimensionality reduction; logistic regression; component analysis; neural network; Bayes theorem; naive Bayes; area; models; tree
DOI
10.3390/info10040150
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In recent years, there has been an exponential growth in the number of complex documents and texts that require a deeper understanding of machine learning methods to be able to accurately classify texts in many applications. Many machine learning approaches have achieved surpassing results in natural language processing. The success of these learning algorithms relies on their capacity to understand complex models and non-linear relationships within data. However, finding suitable structures, architectures, and techniques for text classification is a challenge for researchers. In this paper, a brief overview of text classification algorithms is discussed. This overview covers different text feature extractions, dimensionality reduction methods, existing algorithms and techniques, and evaluation methods. Finally, the limitations of each technique and their application in real-world problems are discussed.
Pages: 68
Related papers
50 results
  • [1] A Survey on Text Classification Algorithms: From Text to Predictions
    Gasparetto, Andrea
    Marcuzzo, Matteo
    Zangari, Alessandro
    Albarelli, Andrea
    INFORMATION, 2022, 13 (02)
  • [2] A Comprehensive Study of Text Classification Algorithms
    Vijayan, Vikas K.
    Bindu, K. R.
    Parameswaran, Latha
    2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017 : 1109 - 1113
  • [3] A Comparative Study of Parametric Versus Non-Parametric Text Classification Algorithms
    Chistol, Mihaela
    2020 15TH INTERNATIONAL CONFERENCE ON DEVELOPMENT AND APPLICATION SYSTEMS (DAS), 2020 : 208 - 213
  • [4] A survey on text classification and its applications
    Zhou, Xujuan
    Gururajan, Raj
    Li, Yuefeng
    Venkataraman, Revathi
    Tao, Xiaohui
    Bargshady, Ghazal
    Barua, Prabal D.
    Kondalsamy-Chennakesavan, Srinivas
    WEB INTELLIGENCE, 2020, 18 (03) : 205 - 216
  • [5] A Review of Machine Learning Algorithms for Text Classification
    Li, Ruiguang
    Liu, Ming
    Xu, Dawei
    Gao, Jiaqi
    Wu, Fudong
    Zhu, Liehuang
    CYBER SECURITY, CNCERT 2021, 2022, 1506 : 226 - 234
  • [6] DATA MINING CLASSIFICATION ALGORITHMS: A SURVEY
    Mohamed, Saouabi
    Abdellah, Ezzati
    INTERNATIONAL JOURNAL OF SECURITY AND ITS APPLICATIONS, 2021, 15 (01) : 45 - 50
  • [7] Feature Selection For Text Classification Using Genetic Algorithms
    Bidi, Noria
    Elberrichi, Zakaria
    PROCEEDINGS OF 2016 8TH INTERNATIONAL CONFERENCE ON MODELLING, IDENTIFICATION & CONTROL (ICMIC 2016), 2016 : 806 - 810
  • [8] Preferential text classification: learning algorithms and evaluation measures
    Aiolli, Fabio
    Cardin, Riccardo
    Sebastiani, Fabrizio
    Sperduti, Alessandro
    INFORMATION RETRIEVAL, 2009, 12 (05) : 559 - 580
  • [9] Short Text Clustering Algorithms, Application and Challenges: A Survey
    Ahmed, Majid Hameed
    Tiun, Sabrina
    Omar, Nazlia
    Sani, Nor Samsiah
    APPLIED SCIENCES-BASEL, 2023, 13 (01)