An encoding technique based on word importance for the clustering of web documents

Cited by: 0
Authors
Zakos, J [1]
Verma, B [1]
Affiliations
[1] Griffith Univ, Sch Informat Technol, Gold Coast, Qld 9726, Australia
Source
ICONIP'02: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON NEURAL INFORMATION PROCESSING: COMPUTATIONAL INTELLIGENCE FOR THE E-AGE | 2002
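Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract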
In this paper we present a word encoding and clustering technique that groups web documents according to the importance of the words that appear in them. We use a two-level self-organizing map architecture to generate clusters of words and documents. We propose that, by capturing information about word importance, similar documents can be clustered to assist web document retrieval. A web document retrieval system is presented to demonstrate how this approach could be integrated into web search.
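The two-level idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how word importance is measured, so the features used here (mean relative frequency and document frequency), the 1-D map layout, the map sizes, and the names `train_som`, `best_unit` and `cluster_documents` are all assumptions made for illustration. The first map groups words by their assumed importance features; the second map groups documents by how their tokens are distributed over those word clusters.

```python
import math
import random
from collections import Counter


def best_unit(units, v):
    """Index of the map unit whose weights are closest to v (squared Euclidean)."""
    return min(range(len(units)),
               key=lambda i: sum((w - x) ** 2 for w, x in zip(units[i], v)))


def train_som(vectors, n_units, epochs=60, lr0=0.5, seed=0):
    """Train a 1-D self-organizing map and return its unit weight vectors."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    units = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    radius0 = max(n_units / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                     # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)   # neighbourhood shrinks over time
        for v in vectors:
            b = best_unit(units, v)
            for i, w in enumerate(units):
                # Gaussian neighbourhood around the winning unit
                h = math.exp(-((i - b) ** 2) / (2.0 * radius ** 2))
                for k in range(dim):
                    w[k] += lr * h * (v[k] - w[k])
    return units


def cluster_documents(docs, n_word_units=4, n_doc_units=3):
    """docs: non-empty list of non-empty token lists.

    Returns one document-map unit index per document.
    """
    counts = [Counter(d) for d in docs]
    vocab = sorted(set().union(*counts))
    # Level 1 input (ASSUMED importance features, not taken from the paper):
    # mean relative frequency and document frequency of each word.
    word_vecs = []
    for w in vocab:
        mean_tf = sum(c[w] / len(d) for c, d in zip(counts, docs)) / len(docs)
        doc_freq = sum(1 for c in counts if w in c) / len(docs)
        word_vecs.append([mean_tf, doc_freq])
    word_map = train_som(word_vecs, n_word_units, seed=1)
    word_cluster = {w: best_unit(word_map, v) for w, v in zip(vocab, word_vecs)}
    # Level 2 input: share of each document's tokens that fall in each word cluster.
    doc_vecs = []
    for c, d in zip(counts, docs):
        vec = [0.0] * n_word_units
        for w, n in c.items():
            vec[word_cluster[w]] += n / len(d)
        doc_vecs.append(vec)
    doc_map = train_som(doc_vecs, n_doc_units, seed=2)
    return [best_unit(doc_map, v) for v in doc_vecs]
```

Calling `cluster_documents` on a list of tokenized documents returns one unit index per document. Documents with identical token distributions always map to the same unit, while the separation of dissimilar documents depends on the assumed features and map sizes.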
Pages: 2207-2211
Page count: 5
Related papers
50 in total
  • [31] A new approach for fuzzy clustering of web documents
    Friedman, M
    Last, M
    Zaafrany, O
    Schneider, M
    Kandel, A
    2004 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, VOLS 1-3, PROCEEDINGS, 2004, : 377 - 381
  • [32] A Word Clustering-Based Crime Report Categorization Technique
    Das, Priyanka
    Das, Asit Kumar
    COMPUTATIONAL INTELLIGENCE IN PATTERN RECOGNITION, CIPR 2020, 2020, 1120 : 451 - 463
  • [33] Fuzzy co-clustering of web documents
    William-Chandra, T
    Chen, L
    2005 INTERNATIONAL CONFERENCE ON CYBERWORLDS, PROCEEDINGS, 2005, : 545 - 551
  • [34] Personalized Metaheuristic Clustering Onto Web Documents
    Wookey Lee
    潍坊学院学报, 2004, (04) : 1 - 4
  • [35] Contextual Query based on Segmentation and Clustering of Selected Documents for Acquiring Web Documents for Supporting Knowledge Management
    Prates, Joao C.
    Siqueira, Sean S. M.
    AMCIS 2011 PROCEEDINGS, 2011,
  • [36] A consistent web documents based text clustering using concept based mining model
    Navaneethakumar, V.M.
    Chandrasekar, C.
    International Journal of Computer Science Issues, 2012, 9 (4 4-1): : 365 - 370
  • [37] Word Embedding-based Web Service Representations for Classification and Clustering
    Zhang, Xiangping
    Liu, Jianxun
    Shi, Min
    Cao, Buqing
    2021 IEEE INTERNATIONAL CONFERENCE ON SERVICES COMPUTING (SCC 2021), 2021, : 34 - 43
  • [38] Parallelization of a graph-cut based algorithm for hierarchical clustering of web documents
    Seshadri, Karthick
    Shalinie, S. Mercy
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2015, 27 (17): : 5156 - 5176
  • [39] Semantic Clustering of Web Documents: An Ontology based Approach Using Swarm Intelligence
    Avanija, J.
    Ramar, K.
    INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY AND WEB ENGINEERING, 2012, 7 (04) : 20 - 33
  • [40] Using word clusters to detect similar web documents
    Koberstein, Jonathan
    Ng, Yiu-Kai
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, 2006, 4092 : 215 - 228