Feature Extraction based Text Classification using K-Nearest Neighbor Algorithm

Cited by: 0
Authors
Azam, Muhammad [1]
Ahmed, Tanvir [1]
Sabah, Fahad [1]
Hussain, Muhammad Iftikhar [2, 3]
Affiliations
[1] Super Univ Lahore, Dept Comp Sci & Informat Technol, Lahore, Pakistan
[2] Beijing Univ Technol, Fac Informat Technol, Beijing 100124, Peoples R China
[3] Beijing Univ Technol, Beijing Engn Res Ctr IoT Software & Syst, Beijing 100124, Peoples R China
Source
INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY | 2018 / Vol. 18 / Issue 12
Keywords
K-NN; naive bayes; text classification; rapid miner; feature extraction;
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
The number of scientific publications has been increasing enormously, and with this increase the classification of scientific publications is becoming a challenging task. The core objective of this research is to analyze the performance of classification algorithms on a Scopus dataset. In text classification, feature extraction from documents and classification using the extracted features are the major issues that degrade the performance of different algorithms. In this paper, the performance of classification algorithms such as Naive Bayes (NB) and K-Nearest Neighbor (K-NN) improved when combined with Bayesian boosting and bagging. The selected classification algorithms were run over 10K documents from Scopus and evaluated using the F-measure, and comparison matrices were produced to estimate accuracy, precision, and recall for the NB and K-NN classifiers. Further, data preprocessing and cleaning steps were applied to the selected dataset, and class imbalance issues were analyzed, to increase the performance of text classification algorithms. Experimental results showed performance gains of over 7% using K-NN, which performed better than NB.
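As a rough illustration of the kind of pipeline the abstract describes (feature extraction, K-NN classification, then precision, recall, and F-measure), the following is a minimal sketch and not the authors' method. It assumes plain term-frequency features with cosine similarity, uses a tiny invented toy corpus in place of the Scopus data, and omits the Bayesian boosting and bagging steps; all function names and labels are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    # Term-frequency features: lowercase, whitespace-split tokens (an assumption;
    # real preprocessing would also handle stop words, stemming, and so on).
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, text, k=3):
    # train is a list of (document, label) pairs; the k most similar
    # training documents vote on the label of the query text.
    query = tf_vector(text)
    ranked = sorted(train, key=lambda pair: cosine(tf_vector(pair[0]), query),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def precision_recall_f1(y_true, y_pred, positive):
    # Per-class precision, recall, and F-measure (harmonic mean of the two).
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Invented toy corpus; the labels "cs" and "bio" are placeholders.
TRAIN = [
    ("neural network training deep learning", "cs"),
    ("text classification with nearest neighbor", "cs"),
    ("graph algorithm complexity analysis", "cs"),
    ("protein folding in cell biology", "bio"),
    ("gene expression and cell growth", "bio"),
    ("enzyme activity in protein synthesis", "bio"),
]
TEST = [
    ("deep learning for text classification", "cs"),
    ("cell protein expression", "bio"),
]

if __name__ == "__main__":
    y_true = [label for _, label in TEST]
    y_pred = [knn_predict(TRAIN, doc) for doc, _ in TEST]
    print(y_pred)
    print(precision_recall_f1(y_true, y_pred, positive="cs"))
```

On a corpus with class imbalance, per-class precision and recall of this kind are generally more informative than overall accuracy, which is one reason an F-measure is a common choice for such comparisons.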
Pages: 95-101
Page count: 7
Related papers
50 items in total
[31]   An Improved K-Nearest Neighbor Algorithm Using Tree Structure and Pruning Technology [J].
Li, Juan .
INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2019, 25 (01) :35-48
[32]   Human Activity Recognition Using K-Nearest Neighbor Machine Learning Algorithm [J].
Mohsen, Saeed ;
Elkaseer, Ahmed ;
Scholz, Steffen G. .
SUSTAINABLE DESIGN AND MANUFACTURING, KES-SDM 2021, 2022, 262 :304-313
[33]   Effective k-nearest neighbor models for data classification enhancement [J].
Amer, Ali A. ;
Ravana, Sri Devi ;
Habeeb, Riyaz Ahamed Ariyaluran .
JOURNAL OF BIG DATA, 2025, 12 (01)
[34]   Implementation of the Advanced Traffic Management System using k-Nearest Neighbor Algorithm [J].
Lusiandro, Muchammad Arfan ;
Nasution, Surya Michrandi ;
Setianingsih, Casi .
2020 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY SYSTEMS AND INNOVATION (ICITSI), 2020, :149-154
[35]   Research on the Improvement of K-Nearest Neighbor Classifier for Imbalanced Text Categorization [J].
Yang Yanmei ;
Xu Linying .
2018 EIGHTH INTERNATIONAL CONFERENCE ON INSTRUMENTATION AND MEASUREMENT, COMPUTER, COMMUNICATION AND CONTROL (IMCCC 2018), 2018, :968-972
[36]   Multi-GPU Implementation of k-Nearest Neighbor Algorithm [J].
Masek, Jan ;
Burget, Kadim ;
Karasek, Jan ;
Uher, Vaclav ;
Dutta, Malay Kishore .
2015 38TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING (TSP), 2015, :764-767
[37]   Botnet Identification Based on Flow Traffic by Using K-Nearest Neighbor [J].
Gunawan, Dani ;
Hairani, Tika ;
Hizriadi, Ainul .
2019 11TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER SCIENCE AND INFORMATION SYSTEMS (ICACSIS 2019), 2019, :95-99
[38]   Deep Learning-Based Metaheuristic Weighted K-Nearest Neighbor Algorithm for the Severity Classification of Breast Cancer [J].
Chakravarthy, S. R. Sannasi ;
Bharanidharan, N. ;
Rajaguru, H. .
IRBM, 2023, 44 (03)
[39]   Analysis of Synthetic Data Utilization with Generative Adversarial Network in Flood Classification using K-Nearest Neighbor Algorithm [J].
Afriza, Wahyu ;
Riasetiawan, Mardhani ;
Tyas, Dyah Aruming .
INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2023, 14 (12) :678-683
[40]   Hybrid SORN Implementation of k-Nearest Neighbor Algorithm on FPGA [J].
Huelsmeier, Nils ;
Baerthel, Moritz ;
Karsthof, Ludwig ;
Rust, Jochen ;
Paul, Steffen .
2022 20TH IEEE INTERREGIONAL NEWCAS CONFERENCE (NEWCAS), 2022, :163-167