Automatic text classification using machine learning and optimization algorithms

Cited by: 15
Authors
Janani, R. [1]
Vijayarani, S. [1]
Affiliations
[1] Bharathiar Univ, Dept Comp Sci, Coimbatore, Tamil Nadu, India
Keywords
Text mining; Information retrieval; Document classification; Content analysis; Feature selection; Bio-inspired algorithms; PSO; ACO; ABC; FA; OTFS algorithm; Machine learning algorithms; NB; KNN; SVM; PNN; MLearn-ATC; DOCUMENTS;
DOI
10.1007/s00500-020-05209-8
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, the volume of digital text documents has grown enormously. As a consequence, there is a need to automatically group and classify documents based on their content. The main goal of text classification is to partition an unstructured set of documents into their respective categories based on their content. The main aim of this research work is to automatically classify the documents stored on a personal computer into their relevant categories. The work has two phases: in the first, the important features are selected for classification; in the second, the text documents are classified, with the output of the first phase used as the input to the second. For selecting the optimal features, this work proposes a new algorithm, the optimization technique for feature selection (OTFS) algorithm. To assess its performance, the OTFS algorithm was compared with the existing approaches artificial bee colony, firefly algorithm, ant colony optimization and particle swarm optimization. In the second phase, this work proposes the machine learning-based automatic text classification (MLearn-ATC) algorithm, which was compared with the widely used classification techniques probabilistic neural network, support vector machine, K-nearest neighbor and Naive Bayes. The results show that the proposed algorithms achieve better accuracy in optimizing the features and classifying the text documents based on their content.
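The two-phase structure described in the abstract (select features, then classify using only those features) can be illustrated with a minimal sketch. This record does not specify how OTFS or MLearn-ATC work, so neither is implemented here. As stand-ins, the sketch assumes a simple per-class document-frequency contrast for feature selection and a multinomial Naive Bayes classifier (one of the baselines named in the abstract). All function names and the toy documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Naive whitespace tokenizer; real pipelines would also strip punctuation.
    return text.lower().split()


def select_features(docs, labels, k):
    """Phase 1 stand-in (NOT the OTFS algorithm): keep the k terms whose
    document-frequency rate differs most between classes."""
    classes = sorted(set(labels))
    class_size = Counter(labels)
    doc_freq = {c: Counter() for c in classes}
    for text, label in zip(docs, labels):
        doc_freq[label].update(set(tokenize(text)))
    terms = set().union(*doc_freq.values())

    def score(term):
        rates = [doc_freq[c][term] / class_size[c] for c in classes]
        return max(rates) - min(rates)

    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(terms, key=lambda t: (-score(t), t))[:k]


def train_nb(docs, labels, vocab):
    """Phase 2 stand-in (NOT MLearn-ATC): multinomial Naive Bayes with
    Laplace smoothing, restricted to the selected vocabulary."""
    term_counts = defaultdict(Counter)
    for text, label in zip(docs, labels):
        term_counts[label].update(t for t in tokenize(text) if t in vocab)
    model = {}
    for label, n_docs in Counter(labels).items():
        total = sum(term_counts[label].values())
        log_prior = math.log(n_docs / len(labels))
        log_like = {
            t: math.log((term_counts[label][t] + 1) / (total + len(vocab)))
            for t in vocab
        }
        model[label] = (log_prior, log_like)
    return model


def predict(model, vocab, text):
    tokens = [t for t in tokenize(text) if t in vocab]

    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(log_like[t] for t in tokens)

    return max(sorted(model), key=score)


# Hypothetical toy corpus, for illustration only.
DOCS = [
    "the match ended with a late goal",
    "the team won the football match",
    "the striker scored another goal for the team",
    "the market fell as stock prices dropped",
    "investors sold stock after the market report",
    "the bank raised prices and the market reacted",
]
LABELS = ["sports"] * 3 + ["finance"] * 3

# The output of phase 1 (selected terms) is the input to phase 2.
VOCAB = set(select_features(DOCS, LABELS, k=6))
MODEL = train_nb(DOCS, LABELS, VOCAB)
```

Calling `predict(MODEL, VOCAB, "stock market prices")` classifies a new document using only the selected terms. Words that appear equally often in every class, such as "the", score zero in phase 1 and never reach the classifier.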
Pages: 1129-1145
Page count: 17