KNN with TF-IDF Based Framework for Text Categorization

Cited by: 151
Authors
Trstenjak, Bruno [1]
Mikac, Sasa [2]
Donko, Dzenana [3]
Affiliations
[1] Medimurje Univ Appl Sci Cakovec, Dept Comp Engn, Cakovec, Croatia
[2] Fac Elect Engn & Comp Sci, Dept Comp Sci, Maribor, Slovenia
[3] Fac Elect Engn, Dept Comp Sci, Sarajevo, Bosnia & Herceg
Source
24TH DAAAM INTERNATIONAL SYMPOSIUM ON INTELLIGENT MANUFACTURING AND AUTOMATION, 2013 | 2014 / Vol. 69
Keywords
text documents classification; K-Nearest Neighbor; TF-IDF; framework; machine learning;
DOI
10.1016/j.proeng.2014.03.129
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
KNN is a very popular algorithm for text classification. This paper presents the possibility of using the KNN algorithm with the TF-IDF method, together with a framework for text classification. The framework enables classification according to various parameters, as well as measurement and analysis of results. Evaluation of the framework focused on the speed and quality of classification. The test results showed the good and bad features of the algorithm, providing guidance for the further development of similar frameworks. (C) 2014 The Authors. Published by Elsevier Ltd.
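The combination named in the abstract can be illustrated with a minimal sketch. This record does not state the paper's exact weighting scheme, distance measure, or value of k, so the smoothed IDF formula, cosine similarity, majority voting, the toy training sentences, and all function names below are assumptions chosen for illustration, not the authors' framework.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def fit_idf(docs_tokens):
    # Smoothed inverse document frequency (one common variant, assumed here).
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def tfidf_vector(tokens, idf):
    # Sparse TF-IDF vector as a dict; terms unseen in training are ignored.
    counts = Counter(t for t in tokens if t in idf)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: (c / total) * idf[t] for t, c in counts.items()}


def cosine(a, b):
    # Cosine similarity between two sparse vectors; 0.0 if either is empty.
    if not a or not b:
        return 0.0
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b)


def knn_classify(text, train_vecs, train_labels, idf, k=3):
    # Rank training documents by similarity and take a majority vote over the top k.
    query = tfidf_vector(tokenize(text), idf)
    ranked = sorted(
        ((cosine(query, vec), label) for vec, label in zip(train_vecs, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus, invented for this sketch.
train = [
    ("the team won the football match", "sport"),
    ("the striker scored a goal in the match", "sport"),
    ("the player signed for a new football team", "sport"),
    ("the bank raised interest rates", "finance"),
    ("stock markets fell as interest rates rose", "finance"),
    ("the bank reported higher profit", "finance"),
]
train_tokens = [tokenize(text) for text, _ in train]
train_labels = [label for _, label in train]
idf = fit_idf(train_tokens)
train_vecs = [tfidf_vector(tokens, idf) for tokens in train_tokens]

sport_guess = knn_classify("football team scored a goal", train_vecs, train_labels, idf)
finance_guess = knn_classify("interest rates and bank profit", train_vecs, train_labels, idf)
```

Because KNN compares the query against every stored training vector at prediction time, classification cost grows with the size of the training set, which is one reason speed is a natural evaluation criterion for such a framework.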
Pages: 1356-1364
Page count: 9
Related papers
50 in total
  • [41] A detection method for android application security based on TF-IDF and machine learning
    Yuan, Hongli
    Tang, Yongchuan
    Sun, Wenjuan
    Liu, Li
    PLOS ONE, 2020, 15 (09):
  • [42] An Image Classification Method Based on Matching Similarity and TF-IDF Value of Region
    Xu, Donghua
    Qu, Zhiyi
    2013 SIXTH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID), VOL 2, 2013, : 112 - 115
  • [43] Comments Mining With TF-IDF: The Inherent Bias and Its Removal
    Yahav, Inbal
    Shehory, Onn
    Schwartz, David
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2019, 31 (03) : 437 - 450
  • [44] Assessment of Machine Learning Models in Detecting DGA Botnet in Characteristics by TF-IDF
    Tong Anh Tuan
    Nguyen Viet Anh
    Hoang Viet Long
    2021 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLIED NETWORK TECHNOLOGIES (ICMLANT II), 2021, : 79 - 83
  • [45] Embedding User Behavioral Aspect in TF-IDF like Representation
    Pradhan, Ligaj
    Zhang, Chengcui
    Bethard, Steven
    Chen, Xin
    IEEE 1ST CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2018), 2018, : 262 - 267
  • [46] Summarization of Daily News Using TextRank and TF-IDF Algorithm
    Jain, Rekha
    Singh, Poonam
    Puri, Shalini
    FOURTH CONGRESS ON INTELLIGENT SYSTEMS, VOL 3, CIS 2023, 2024, 865 : 313 - 324
  • [47] Exploiting tf-idf in deep Convolutional Neural Networks for Content Based Image Retrieval
    Kondylidis, Nikolaos
    Tzelepi, Maria
    Tefas, Anastasios
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (23) : 30729 - 30748
  • [48] Analysis of TF-IDF Model and its Variant for Document Retrieval
    Mishra, Apra
    Vishwakarma, Santosh
    2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND COMMUNICATION NETWORKS (CICN), 2015, : 772 - 776
  • [49] Indoor Scene Recognition via Object Detection and TF-IDF
    Heikel, Edvard
    Espinosa-Leal, Leonardo
    JOURNAL OF IMAGING, 2022, 8 (08)
  • [50] A Method of Extracting Malware Features Based on Gini Impurity Increment and Improved TF-IDF
    Sun, Shimiao
    Liu, Yashu
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2023, 20 (03) : 419 - 427