Feature Extraction based Text Classification using K-Nearest Neighbor Algorithm

Cited by: 0
Authors
Azam, Muhammad [1]
Ahmed, Tanvir [1]
Sabah, Fahad [1]
Hussain, Muhammad Iftikhar [2, 3]
Affiliations
[1] Super Univ Lahore, Dept Comp Sci & Informat Technol, Lahore, Pakistan
[2] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[3] Beijing Univ Technol, Beijing Engn Res Ctr IoT Software & Syst, Beijing 100124, Peoples R China
Source
INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY | 2018 / Vol. 18 / Issue 12
Keywords
K-NN; naive bayes; text classification; rapid miner; feature extraction;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
The number of scientific publications has been increasing enormously, and with this increase the classification of scientific publications is becoming a challenging task. The core objective of this research is to analyze the performance of classification algorithms on a Scopus dataset. In text classification, extracting features from documents and classifying with the extracted features are the major issues that degrade the performance of different algorithms. In this paper, classification algorithms such as Naive Bayes (NB) and K-Nearest Neighbor (K-NN) show improved performance when combined with Bayesian boosting and bagging. The selected classification algorithms were evaluated on 10K documents from Scopus using the F-measure, and comparison matrices were produced to estimate accuracy, precision and recall for the NB and K-NN classifiers. Further, data preprocessing and cleaning steps are applied to the selected dataset, and class imbalance issues are analyzed, to increase the performance of the text classification algorithms. Experimental results showed a performance improvement of over 7% with K-NN, which performed better than NB.
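As a rough illustration of the kind of pipeline the abstract describes, the sketch below classifies short texts with K-NN over bag-of-words vectors and scores the result with precision, recall and F-measure. It is not the authors' implementation: the keywords indicate they worked in RapidMiner, and the toy documents, the cosine-similarity choice, the value of k, and all function names here are assumptions made for the example.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words feature extraction: lowercase token counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_predict(train, text, k=3):
    """Majority vote over the k training documents most similar to text."""
    query = vectorize(text)
    ranked = sorted(train, key=lambda item: cosine(vectorize(item[0]), query),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F-measure for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical toy corpus standing in for labelled publication records.
TRAIN = [
    ("neural network training deep learning", "cs"),
    ("graph algorithm complexity sorting", "cs"),
    ("database query index transaction", "cs"),
    ("protein gene cell expression", "bio"),
    ("enzyme protein folding cell", "bio"),
    ("gene mutation dna sequencing", "bio"),
]
TEST = [
    ("deep learning network", "cs"),
    ("dna gene expression", "bio"),
    ("protein cell membrane", "bio"),
    ("sorting algorithm index", "cs"),
]

if __name__ == "__main__":
    y_true = [label for _, label in TEST]
    y_pred = [knn_predict(TRAIN, text) for text, _ in TEST]
    print(precision_recall_f1(y_true, y_pred, positive="bio"))
```

On a real, imbalanced corpus the per-class F-measure would normally be reported for every class rather than for a single positive label as done here.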
Pages: 95 - 101
Number of pages: 7
Related papers
50 in total
  • [1] Novel text classification based on K-nearest neighbor
    Yu, Xiao-Peng
    Yu, Xiao-Gao
    PROCEEDINGS OF 2007 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2007, : 3425 - +
  • [2] An Improved K-Nearest Neighbor Algorithm for Pattern Classification
    Sultana, Zinnia
    Ferdousi, Ashifatul
    Tasnim, Farzana
    Nahar, Lutfun
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2022, 13 (08) : 760 - 767
  • [3] Arabic Text Classification Using K-Nearest Neighbour Algorithm
    Alhutaish, Roiss
    Omar, Nazlia
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2015, 12 (02) : 190 - 195
  • [4] Effective Classification of EEG Signals using K-Nearest Neighbor Algorithm
    Awan, Umer I.
    Rajput, U. H.
    Syed, Ghazaal
    Iqbal, Rimsha
    Sabat, Ifra
    Mansoor, M.
    PROCEEDINGS OF 14TH INTERNATIONAL CONFERENCE ON FRONTIERS OF INFORMATION TECHNOLOGY PROCEEDINGS - FIT 2016, 2016, : 120 - 124
  • [5] Protein Sequence Classification Based on N-Gram and K-Nearest Neighbor Algorithm
    Dongardive, Jyotshna
    Abraham, Siby
    COMPUTATIONAL INTELLIGENCE IN DATA MINING, CIDM, VOL 2, 2016, 411 : 163 - 171
  • [6] Feature Extraction Algorithm Based on K Nearest Neighbor Local Margin
    Pan, Feng
    Wang, Jiandong
    Lin, Xiaohui
    PROCEEDINGS OF THE 2009 CHINESE CONFERENCE ON PATTERN RECOGNITION AND THE FIRST CJK JOINT WORKSHOP ON PATTERN RECOGNITION, VOLS 1 AND 2, 2009, : 20 - +
  • [7] KRA: K-Nearest Neighbor Retrieval Augmented Model for Text Classification
    Li, Jie
    Tang, Chang
    Lei, Zhechao
    Zhang, Yirui
    Li, Xuan
    Yu, Yanhua
    Pi, Renjie
    Hu, Linmei
    ELECTRONICS, 2024, 13 (16)
  • [8] Enhancing data classification using locally informed weighted k-nearest neighbor algorithm
    Abdalla, Hassan, I
    Amer, Ali A.
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 276
  • [9] Feature Extraction, Selection, and K-Nearest Neighbors Algorithm for Shark Behavior Classification Based on Imbalanced Dataset
    Yang, Yu
    Yeh, Hen-Geul
    Zhang, Wenlu
    Lee, Calvin J.
    Meese, Emily N.
    Lowe, Christopher G.
    IEEE SENSORS JOURNAL, 2021, 21 (05) : 6429 - 6439
  • [10] Melon Ripeness Determination Using K-nearest Neighbor Algorithm
    Samar, Homer John M.
    Manalang, Hernanny Jeremy J.
    Villaverde, Jocelyn F.
    2024 16TH INTERNATIONAL CONFERENCE ON COMPUTER AND AUTOMATION ENGINEERING, ICCAE 2024, 2024, : 461 - 466