KNN with TF-IDF Based Framework for Text Categorization

Cited by: 151
Authors
Trstenjak, Bruno [1]
Mikac, Sasa [2]
Donko, Dzenana [3]
Affiliations
[1] Medimurje Univ Appl Sci Cakovec, Dept Comp Engn, Cakovec, Croatia
[2] Fac Elect Engn & Comp Sci, Dept Comp Sci, Maribor, Slovenia
[3] Fac Elect Engn, Dept Comp Sci, Sarajevo, Bosnia & Herceg
Source
24TH DAAAM INTERNATIONAL SYMPOSIUM ON INTELLIGENT MANUFACTURING AND AUTOMATION, 2013 | 2014, Vol. 69
Keywords
text documents classification; K-Nearest Neighbor; TF-IDF; framework; machine learning;
DOI
10.1016/j.proeng.2014.03.129
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
KNN is a very popular algorithm for text classification. This paper presents the possibility of using the KNN algorithm with the TF-IDF method and a framework for text classification. The framework enables classification according to various parameters, as well as measurement and analysis of results. Evaluation of the framework focused on the speed and quality of classification. The test results showed the good and bad features of the algorithm, providing guidance for the further development of similar frameworks. (C) 2014 The Authors. Published by Elsevier Ltd.
Pages: 1356 - 1364
Page count: 9