Automatic text classification using machine learning and optimization algorithms

Cited by: 16
Authors
Janani, R. [1]
Vijayarani, S. [1]
Affiliation
[1] Bharathiar Univ, Dept Comp Sci, Coimbatore, Tamil Nadu, India
Keywords
Text mining; Information retrieval; Document classification; Content analysis; Feature selection; Bio-inspired algorithms; PSO; ACO; ABC; FA; OTFS algorithm; Machine learning algorithms; NB; KNN; SVM; PNN; MLearn-ATC; DOCUMENTS
DOI
10.1007/s00500-020-05209-8
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, the volume of text documents in digital form has grown enormously. As a consequence, there is a need to automatically gather and classify documents based on their content. The main goal of text classification is to partition an unstructured set of documents into their respective categories based on their content. The main aim of this research work is to automatically classify the documents stored on a personal computer into their relevant categories. The work has two phases: in the first, the important features are selected for classification; in the second, the text documents are classified. For selecting the optimal features, this work proposes a new algorithm, the optimization technique for feature selection (OTFS) algorithm. To assess its proficiency, the OTFS algorithm was compared with existing approaches: artificial bee colony (ABC), firefly algorithm (FA), ant colony optimization (ACO) and particle swarm optimization (PSO). For the second phase, this work proposes the machine learning-based automatic text classification (MLearn-ATC) algorithm, which was compared with widely used classification techniques: probabilistic neural network (PNN), support vector machine (SVM), K-nearest neighbor (KNN) and Naive Bayes (NB). The output of the first phase is used as the input to the classification phase. The results indicate that the proposed algorithms achieve better accuracy in optimizing the features and classifying the text documents based on their content.
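The two-phase structure described in the abstract (feature selection feeding a classifier) can be illustrated with a minimal sketch. This is not the OTFS or MLearn-ATC algorithm, neither of which is specified in this record: phase 1 is replaced here by a simple document-frequency spread score, and phase 2 by a multinomial Naive Bayes classifier, one of the baselines the abstract names. The toy documents, the function names, and the choice of k are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenizer; real pipelines would also stem and drop stop words.
    return text.lower().split()


def select_features(docs, labels, k):
    """Phase 1 stand-in (NOT OTFS): keep the k terms whose document
    frequency differs most between classes."""
    classes = sorted(set(labels))
    n_docs = Counter(labels)
    df = {c: Counter() for c in classes}
    for text, label in zip(docs, labels):
        df[label].update(set(tokenize(text)))
    vocab = set().union(*df.values())

    def spread(term):
        fracs = [df[c][term] / n_docs[c] for c in classes]
        return max(fracs) - min(fracs)

    # Ties are broken alphabetically so the result is deterministic.
    ranked = sorted(vocab, key=lambda t: (-spread(t), t))
    return set(ranked[:k])


def train_nb(docs, labels, features):
    """Phase 2 stand-in (NOT MLearn-ATC): multinomial Naive Bayes with
    add-one smoothing, restricted to the selected features."""
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    counts = {c: Counter() for c in classes}
    for text, label in zip(docs, labels):
        counts[label].update(t for t in tokenize(text) if t in features)
    loglik = {}
    for c in classes:
        total = sum(counts[c].values()) + len(features)
        loglik[c] = {t: math.log((counts[c][t] + 1) / total) for t in features}
    return prior, loglik, features


def predict(model, text):
    prior, loglik, features = model
    tokens = [t for t in tokenize(text) if t in features]
    scores = {c: prior[c] + sum(loglik[c][t] for t in tokens) for c in prior}
    return max(scores, key=scores.get)


# Hypothetical toy corpus, invented for this example.
docs = [
    "the match ended with a late goal",
    "the team won the match",
    "the striker scored a goal",
    "the bank raised interest rates",
    "the market and the bank fell",
    "interest rates hit the market",
]
labels = ["sports"] * 3 + ["finance"] * 3

features = select_features(docs, labels, k=6)   # output of phase 1 ...
model = train_nb(docs, labels, features)        # ... is the input to phase 2
print(predict(model, "a late goal in the match"))
print(predict(model, "the bank cut interest rates"))
```

The point of the sketch is only the data flow: the feature set returned by the first phase restricts what the classifier in the second phase sees, so a term such as "the", which appears equally in both classes, never reaches the classifier.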
Pages: 1129-1145
Number of pages: 17