Automatic text classification using machine learning and optimization algorithms

Cited by: 16
Authors
Janani, R. [1]
Vijayarani, S. [1]
Affiliation
[1] Bharathiar Univ, Dept Comp Sci, Coimbatore, Tamil Nadu, India
Keywords
Text mining; Information retrieval; Document classification; Content analysis; Feature selection; Bio-inspired algorithms; PSO; ACO; ABC; FA; OTFS algorithm; Machine learning algorithms; NB; KNN; SVM; PNN; MLearn-ATC; DOCUMENTS;
DOI
10.1007/s00500-020-05209-8
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In recent years, the volume of digital text documents has grown enormously. Consequently, there is a need to automatically group and classify documents based on their content. The main goal of text classification is to partition an unstructured set of documents into their respective categories based on their content. The main aim of this research is to automatically classify documents stored on a personal computer into their relevant categories. The work has two phases: in the first, the important features are selected for classification; in the second, the text documents are classified. To select the optimal features, this work proposes a new algorithm, the optimization technique for feature selection (OTFS) algorithm. To assess its performance, the OTFS algorithm was compared with existing approaches: artificial bee colony, firefly algorithm, ant colony optimization, and particle swarm optimization. For the second phase, this work proposes the machine learning-based automatic text classification (MLearn-ATC) algorithm, which was compared with widely used classification techniques: probabilistic neural network, support vector machine, K-nearest neighbor, and Naive Bayes. The output of the first phase is used as the input to the classification phase. The results indicate that the proposed algorithms achieve better accuracy in optimizing the features and classifying text documents based on their content.
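The two-phase structure described in the abstract (select features first, then classify on the selected features) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not specify how OTFS or MLearn-ATC work, so the sketch substitutes a plain chi-squared term ranking for phase one and a multinomial Naive Bayes classifier (one of the baselines named in the abstract) for phase two. All function names and the toy documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: whitespace tokenization on lowercased text, no stemming.
    return text.lower().split()


def select_features(docs, labels, k):
    """Phase 1 stand-in (NOT OTFS): keep the k terms with the highest
    chi-squared score against any class."""
    n = len(docs)
    doc_terms = [set(tokenize(d)) for d in docs]
    class_count = Counter(labels)
    term_count = Counter(t for terms in doc_terms for t in terms)
    joint = Counter((t, y) for terms, y in zip(doc_terms, labels) for t in terms)
    scores = {}
    for t, n_t in term_count.items():
        best = 0.0
        for y, n_y in class_count.items():
            a = joint[(t, y)]   # docs in class y that contain t
            b = n_t - a         # docs outside y that contain t
            c = n_y - a         # docs in class y without t
            d = n - n_t - c     # docs outside y without t
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[t] = best
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return set(ranked[:k])


def train_nb(docs, labels, vocab):
    """Phase 2 stand-in (NOT MLearn-ATC): multinomial Naive Bayes with
    add-one smoothing, restricted to the selected vocabulary."""
    class_docs = Counter(labels)
    term_counts = defaultdict(Counter)
    for doc, y in zip(docs, labels):
        term_counts[y].update(t for t in tokenize(doc) if t in vocab)
    model = {}
    for y, n_y in class_docs.items():
        total = sum(term_counts[y].values())
        log_prior = math.log(n_y / len(docs))
        log_like = {
            t: math.log((term_counts[y][t] + 1) / (total + len(vocab)))
            for t in vocab
        }
        model[y] = (log_prior, log_like)
    return model


def classify(model, text):
    """Return the class with the highest log-posterior; terms outside the
    selected vocabulary are ignored."""
    tokens = tokenize(text)

    def score(y):
        log_prior, log_like = model[y]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)

    return max(sorted(model), key=score)


# Toy corpus, invented purely for illustration.
docs = [
    "stock market shares rise",
    "bank shares fall market",
    "team wins football match",
    "football player scores match goal",
]
labels = ["finance", "finance", "sport", "sport"]

features = select_features(docs, labels, k=4)   # output of phase 1 ...
model = train_nb(docs, labels, features)        # ... is the input to phase 2
print(sorted(features))
print(classify(model, "shares drop as market slides"))
```

The point of the sketch is the hand-off: the classifier never sees terms that the selection step discarded, which mirrors the statement that the output of the first phase is the input to the classification phase.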
Pages: 1129-1145
Number of pages: 17