Automatic text classification using machine learning and optimization algorithms

Cited by: 15
Authors
Janani, R. [1]
Vijayarani, S. [1]
Affiliation
[1] Bharathiar Univ, Dept Comp Sci, Coimbatore, Tamil Nadu, India
Keywords
Text mining; Information retrieval; Document classification; Content analysis; Feature selection; Bio-inspired algorithms; PSO; ACO; ABC; FA; OTFS algorithm; Machine learning algorithms; NB; KNN; SVM; PNN; MLearn-ATC; DOCUMENTS;
DOI
10.1007/s00500-020-05209-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, the volume of text documents in digital form has grown enormously. As a consequence, there is a need to automatically group and classify documents based on their content. The main goal of text classification is to partition an unstructured set of documents into their respective categories based on their content. The main aim of this research work is to automatically classify documents stored on a personal computer into their relevant categories. The work has two significant phases: in the first phase, the important features are selected for classification, and in the second phase, the text documents are classified. For selecting the optimal features, this work proposes a new algorithm, the optimization technique for feature selection (OTFS) algorithm. To assess the proposed feature selection algorithm, OTFS was compared with the existing approaches artificial bee colony, firefly algorithm, ant colony optimization and particle swarm optimization. In the second phase, this work proposes the machine learning-based automatic text classification (MLearn-ATC) algorithm for text classification. MLearn-ATC was compared with the widely used classification techniques probabilistic neural network, support vector machine, K-nearest neighbor and Naive Bayes. The output of the first phase is used as the input to the classification phase. The results indicate that the proposed algorithms achieve better accuracy in optimizing the features and classifying the text documents based on their content.
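The two-phase structure described in the abstract (select features first, then classify on the reduced feature set) can be illustrated with a minimal Python sketch. This is not the OTFS or MLearn-ATC algorithm, whose details are not given in this record: the feature scorer below is a simple document-frequency contrast chosen as a stand-in, the classifier is multinomial Naive Bayes (one of the baselines the abstract names), and all function names and toy documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for the toy data.
    return text.lower().split()


def select_features(docs, labels, k):
    """Phase 1 stand-in (NOT OTFS): keep the k terms whose per-class
    document frequency differs most between classes."""
    class_sizes = Counter(labels)
    doc_freq = defaultdict(Counter)  # term -> class -> number of docs containing it
    for text, label in zip(docs, labels):
        for term in set(tokenize(text)):
            doc_freq[term][label] += 1
    scores = {}
    for term, per_class in doc_freq.items():
        rates = [per_class[c] / class_sizes[c] for c in class_sizes]
        scores[term] = max(rates) - min(rates)
    # Ties are broken alphabetically so the selection is deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def train_nb(docs, labels, vocab):
    """Phase 2 stand-in (NOT MLearn-ATC): multinomial Naive Bayes with
    add-one smoothing, restricted to the selected vocabulary."""
    class_sizes = Counter(labels)
    term_counts = defaultdict(Counter)
    for text, label in zip(docs, labels):
        term_counts[label].update(t for t in tokenize(text) if t in vocab)
    model = {}
    for c, size in class_sizes.items():
        total = sum(term_counts[c].values())
        log_prior = math.log(size / len(labels))
        log_lik = {
            t: math.log((term_counts[c][t] + 1) / (total + len(vocab)))
            for t in vocab
        }
        model[c] = (log_prior, log_lik)
    return model


def predict(model, text, vocab):
    tokens = [t for t in tokenize(text) if t in vocab]

    def score(c):
        log_prior, log_lik = model[c]
        return log_prior + sum(log_lik[t] for t in tokens)

    return max(sorted(model), key=score)


# Hypothetical toy corpus with two categories.
docs = [
    "the match ended with a late goal",
    "the team won the league match",
    "the bank raised the interest rate",
    "the market fell as the rate rose",
]
labels = ["sports", "sports", "finance", "finance"]

vocab = set(select_features(docs, labels, k=2))  # output of phase 1
model = train_nb(docs, labels, vocab)            # input to phase 2

print(sorted(vocab))
print(predict(model, "a tense match in the league", vocab))
print(predict(model, "interest rate cut", vocab))
```

The point of the sketch is only the data flow: the classifier never sees terms that the selection step discarded, which mirrors the abstract's statement that the first phase feeds the second.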
Pages: 1129-1145
Number of pages: 17