Implicit Links-Based Techniques to Enrich K-Nearest Neighbors and Naive Bayes Algorithms for Web Page Classification

Cited by: 1
Authors
Belmouhcine, Abdelbadie [1]
Benkhalifa, Mohammed [1]
Affiliations
[1] Mohammed V Univ, Comp Sci Dept, Sci Fac, Comp Sci Lab LRI, Rabat, Morocco
DOI
10.1007/978-3-319-26227-7_71
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The web has developed into one of the most relevant data sources and has become a broad knowledge base for almost all fields. Its content grows faster and its size becomes larger every day. Because of this large volume of data, web page classification has become crucial: users have difficulty finding what they are seeking, even when they use search engines. Web page classification is the process of assigning a web page to one or more classes based on previously seen labeled examples. Web pages contain many contextual features that can be used to improve classification accuracy. In this paper, we present a similarity computation technique that is based on implicit links extracted from the query log and is used with K-Nearest Neighbors (KNN) in web page classification. We also introduce an implicit links-based probability computation method used with Naive Bayes (NB) for web page classification. The newly computed similarity and probability enrich KNN and NB, respectively. Experiments are conducted on two subsets of the Open Directory Project (ODP). Results show that (1) when applied as a similarity for KNN, the implicit links-based similarity helps improve results, and (2) the implicit links-based probability helps improve the results provided by NB using only text-based probability.
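The KNN side of the idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not define how implicit links are extracted or weighted, so the sketch assumes that two pages are implicitly linked when users clicked both for the same query, and that the similarity is simply the number of such shared queries. The function names, the toy query log, and the labels are all hypothetical. The Naive Bayes probability variant is not sketched here.

```python
from collections import Counter, defaultdict
from itertools import combinations


def implicit_links(query_log):
    """Count, for each pair of pages, the queries for which both were clicked.

    Assumption: query_log is an iterable of (query, clicked_url) pairs.
    """
    pages_by_query = defaultdict(set)
    for query, url in query_log:
        pages_by_query[query].add(url)

    links = Counter()
    for pages in pages_by_query.values():
        for a, b in combinations(sorted(pages), 2):
            links[frozenset((a, b))] += 1
    return links


def link_similarity(links, a, b):
    """Assumed similarity: number of queries shared by pages a and b."""
    return links.get(frozenset((a, b)), 0)


def knn_classify(page, labeled, links, k=3):
    """Weighted vote among the k labeled pages most strongly linked to `page`.

    Returns None when the page has no implicit link to any labeled page.
    """
    ranked = sorted(
        labeled,
        key=lambda url: link_similarity(links, page, url),
        reverse=True,
    )
    votes = Counter()
    for url in ranked[:k]:
        weight = link_similarity(links, page, url)
        if weight > 0:
            votes[labeled[url]] += weight
    return votes.most_common(1)[0][0] if votes else None


# Toy data (hypothetical): a tiny query log and three labeled pages.
toy_log = [
    ("python tutorial", "a.com"),
    ("python tutorial", "b.com"),
    ("python tutorial", "x.com"),
    ("learn python", "a.com"),
    ("learn python", "x.com"),
    ("cheap flights", "c.com"),
    ("cheap flights", "y.com"),
]
toy_labels = {"a.com": "Computers", "b.com": "Computers", "c.com": "Travel"}
toy_links = implicit_links(toy_log)

print(knn_classify("x.com", toy_labels, toy_links, k=2))
print(knn_classify("y.com", toy_labels, toy_links, k=2))
```

In practice such a link-based score would likely be combined with a text-based similarity rather than used alone, since pages absent from the query log have no implicit links at all.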
Pages: 755-766
Page count: 12
Related papers
47 in total
  • [1] Analyzing the Impact of Principal Component Analysis on k-Nearest Neighbors and Naive Bayes Classification Algorithms
    Macionczyk, Rafal
    Moryc, Michal
    Buchtyar, Patryk
    INFORMATION AND SOFTWARE TECHNOLOGIES, ICIST 2023, 2024, 1979 : 247 - 263
  • [2] Comparison of Support Vector Machine, Naive Bayes, and K-Nearest Neighbors Algorithms for Classifying Heart Disease
    Lewandowicz, Bartosz
    Kisiala, Konrad
    INFORMATION AND SOFTWARE TECHNOLOGIES, ICIST 2023, 2024, 1979 : 274 - 285
  • [3] Choose of Wart Treatment Method Using Naive Bayes and k-Nearest Neighbors Classifiers
    Uzun, Rukiye
    Isler, Yalcin
    Toksan, Mualla
    2018 26TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2018,
  • [4] Internet Traffic Detection using Naive Bayes and K-Nearest Neighbors (KNN) algorithm
    Dixit, Mrudul
    Sharma, Ritu
    Shaikh, Saniya
    Muley, Krutika
    PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND CONTROL SYSTEMS (ICCS), 2019, : 1153 - 1157
  • [5] A comparative study on thyroid disease detection using K-nearest neighbor and Naive Bayes classification techniques
    Khushboo Chandel
    Veenita Kunwar
    Sai Sabitha
    Tanupriya Choudhury
    Saurabh Mukherjee
    CSI Transactions on ICT, 2016, 4 (2-4) : 313 - 319
  • [6] Locally Adaptive Text Classification based k-nearest Neighbors
    Yu, Xiao-gao
    Yu, Xiao-peng
    2007 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-15, 2007, : 5651 - +
  • [7] K-Nearest Neighbor and Naive Bayes Classifier Comparison for Individual Character Classification on Twitter
    Utami, Ema
    Raharjo, Suwanto
    Hartanto, Anggit Dwi
    Adi, Sumarni
    Ichsan, Aminudin Noor
    PROCEEDINGS OF ICORIS 2020: 2020 THE 2ND INTERNATIONAL CONFERENCE ON CYBERNETICS AND INTELLIGENT SYSTEM (ICORIS), 2020, : 63 - 67
  • [8] Time series labeling algorithms based on the K-nearest neighbors' frequencies
    Nasibov, Efendi N.
    Peker, Sinem
    EXPERT SYSTEMS WITH APPLICATIONS, 2011, 38 (05) : 5028 - 5035
  • [9] Machine learning classification based on k-Nearest Neighbors for PolSAR data
    Ferreira, Jodavid A.
    Rodrigues, Anny K. G.
    Ospina, Raydonal
    Gomez, Luis
    ANAIS DA ACADEMIA BRASILEIRA DE CIENCIAS, 2024, 96 (01):
  • [10] Classification of incomplete data based on belief functions and K-nearest neighbors
    Liu, Zhun-ga
    Liu, Yong
    Dezert, Jean
    Pan, Quan
    KNOWLEDGE-BASED SYSTEMS, 2015, 89 : 113 - 125