Automatic text classification using machine learning and optimization algorithms

Cited by: 15
Authors
Janani, R. [1]
Vijayarani, S. [1]
Affiliations
[1] Bharathiar Univ, Dept Comp Sci, Coimbatore, Tamil Nadu, India
Keywords
Text mining; Information retrieval; Document classification; Content analysis; Feature selection; Bio-inspired algorithms; PSO; ACO; ABC; FA; OTFS algorithm; Machine learning algorithms; NB; KNN; SVM; PNN; MLearn-ATC; DOCUMENTS;
DOI
10.1007/s00500-020-05209-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, the volume of digital text documents has grown enormously. As a consequence, there is a need to automatically organize and classify documents based on their content. The main goal of text classification is to partition an unstructured set of documents into their respective categories based on their content. The main aim of this research work is to automatically classify documents stored on a personal computer into their relevant categories. This work has two significant phases. In the first phase, the important features are selected for classification; the second phase is the classification of the text documents. For selecting the optimal features, this research work proposes a new algorithm, the optimization technique for feature selection (OTFS) algorithm. To assess the proposed feature selection algorithm, OTFS was compared with the existing approaches: artificial bee colony, firefly algorithm, ant colony optimization and particle swarm optimization. In the second phase, this research work proposes the machine learning-based automatic text classification (MLearn-ATC) algorithm. For classification, MLearn-ATC was compared with the widely used classification techniques probabilistic neural network, support vector machine, K-nearest neighbor and Naive Bayes. The output of the first phase is used as the input to the classification phase. The results indicate that the proposed algorithms achieve better accuracy in optimizing the features and classifying the text documents based on their content.
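The two-phase structure described in the abstract (select features first, then classify using only those features) can be illustrated with a minimal sketch. This is not the OTFS or MLearn-ATC algorithm, whose details are not given in this record. As stand-ins, the sketch assumes a simple document-frequency contrast score for feature selection and a multinomial Naive Bayes classifier (one of the baselines named in the abstract). All function names and the toy corpus are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: whitespace tokenization is enough for this toy corpus.
    return text.lower().split()


def select_features(docs, labels, k):
    """Phase 1 stand-in (NOT OTFS): keep the k terms whose per-class
    document-frequency rate differs most between classes."""
    class_sizes = Counter(labels)
    doc_freq = defaultdict(Counter)  # term -> class -> docs containing term
    for text, label in zip(docs, labels):
        for term in set(tokenize(text)):
            doc_freq[term][label] += 1

    def score(term):
        rates = [doc_freq[term][c] / class_sizes[c] for c in class_sizes]
        return max(rates) - min(rates)

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(doc_freq, key=lambda t: (-score(t), t))
    return set(ranked[:k])


def train_nb(docs, labels, vocab):
    """Phase 2 stand-in (NOT MLearn-ATC): multinomial Naive Bayes counts,
    restricted to the features selected in phase 1."""
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for text, label in zip(docs, labels):
        term_counts[label].update(t for t in tokenize(text) if t in vocab)
    return class_docs, term_counts, vocab


def predict_nb(model, text):
    class_docs, term_counts, vocab = model
    total_docs = sum(class_docs.values())
    terms = [t for t in tokenize(text) if t in vocab]
    best_label, best_logp = None, -math.inf
    for label in sorted(class_docs):
        total_terms = sum(term_counts[label].values())
        logp = math.log(class_docs[label] / total_docs)
        for t in terms:
            # Laplace (add-one) smoothing over the selected vocabulary.
            logp += math.log((term_counts[label][t] + 1) / (total_terms + len(vocab)))
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Hypothetical toy corpus with two categories.
docs = [
    "the team won the football match",
    "the player scored a goal in the match",
    "the bank raised the interest rate",
    "the market and the bank reacted to the rate",
]
labels = ["sports", "sports", "finance", "finance"]

selected = select_features(docs, labels, k=3)   # output of phase 1
model = train_nb(docs, labels, selected)        # input to phase 2
print(sorted(selected))
print(predict_nb(model, "the bank cut the rate"))
```

In this sketch the selected term set is the only information passed from the first phase to the second, mirroring the abstract's statement that the output of feature selection serves as the input to classification.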
Pages: 1129 - 1145
Page count: 17
Related papers
50 items in total
  • [1] Retraction Note: Automatic text classification using machine learning and optimization algorithms
    R. Janani
    S. Vijayarani
    Soft Computing, 2024, 28 (Suppl 2) : 831 - 831
  • [2] Text Message Classification Using Supervised Machine Learning Algorithms
    Merugu, Suresh
    Reddy, M. Chandra Shekhar
    Goyal, Ekansh
    Piplani, Lakshay
    ICCCE 2018, 2019, 500 : 141 - 150
  • [3] Automatic Classification of Vulnerabilities using Deep Learning and Machine Learning Algorithms
    Ramesh, Vishnu
    Abraham, Sara
    Vinod, P.
    Mohamed, Isham
    Visaggio, Corrado A.
    Laudanna, Sonia
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [4] Automatic Classification of Lung Sounds Using Machine Learning Algorithms
    Ullah, Ahmad
    Khan, Misha Urooj
    Mujahid, Farrukh
    Khan, Muhammad Salman
    2021 INTERNATIONAL CONFERENCE ON FRONTIERS OF INFORMATION TECHNOLOGY (FIT 2021), 2021, : 131 - 136
  • [5] An Automatic Flower Classification Approach Using Machine Learning Algorithms
    Zawbaa, Hossam M.
    Abbass, Mona
    Basha, Sameh H.
    Hazman, Maryam
    Hassenian, Abul Ella
    2014 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2014, : 895 - 901
  • [6] A Review of Machine Learning Algorithms for Text Classification
    Li, Ruiguang
    Liu, Ming
    Xu, Dawei
    Gao, Jiaqi
    Wu, Fudong
    Zhu, Liehuang
    CYBER SECURITY, CNCERT 2021, 2022, 1506 : 226 - 234
  • [7] ROP and TOB optimization using machine learning classification algorithms
    Oyedere, Mayowa
    Gray, Ken
    JOURNAL OF NATURAL GAS SCIENCE AND ENGINEERING, 2020, 77
  • [8] Machine learning algorithms in Arabic Text Classification: A Review
    Aboalnaser, Sara A.
    12TH INTERNATIONAL CONFERENCE ON THE DEVELOPMENTS IN ESYSTEMS ENGINEERING (DESE 2019), 2019, : 290 - 295
  • [9] Machine Learning Algorithms for Automatic Classification of Marmoset Vocalizations
    Turesson, Hjalmar K.
    Ribeiro, Sidarta
    Pereira, Danillo R.
    Papa, Joao P.
    de Albuquerque, Victor Hugo C.
    PLOS ONE, 2016, 11 (09):
  • [10] AUTOMATIC CLASSIFICATION OF UNEQUAL LEXICAL STRESS PATTERNS USING MACHINE LEARNING ALGORITHMS
    Shahin, Mostafa Ali
    Ahmed, Beena
    Ballard, Kirrie J.
    2012 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2012), 2012, : 388 - 391