Web mining with relational clustering

Cited by: 50
Authors
Runkler, TA
Bezdek, JC
Affiliations
[1] Siemens Corp Technol, D-81730 Munich, Germany
[2] Univ W Florida, Dept Comp Sci, Pensacola, FL 32514 USA
Keywords
fuzzy clustering; relational data; keyword extraction; web content mining; web log mining; click stream analysis
DOI
10.1016/S0888-613X(02)00084-1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering is an unsupervised learning method that determines partitions and (possibly) prototypes from pattern sets. Sets of numerical patterns can be clustered by alternating optimization (AO) of clustering objective functions or by alternating cluster estimation (ACE). Sets of non-numerical patterns can often be represented numerically by (pairwise) relations. These relational data sets can be clustered by relational AO and by relational ACE (RACE). We consider two kinds of non-numerical patterns provided by the World Wide Web: document contents such as the text parts of web pages, and sequences of web pages visited by particular users, so-called web logs. The analysis of document contents is often called web content mining, and the analysis of log files with web page sequences is called web log mining. For both non-numerical pattern types (text and web page sequences) relational data sets can be automatically generated using the Levenshtein (edit) distance or using graph distances. The prototypes found for text data can be interpreted as keywords that serve for document classification and automatic archiving. The prototypes found for web page sequences can be interpreted as prototypical click streams that indicate typical user interests, and therefore serve as a basis for web content and web structure management. (C) 2002 Elsevier Science Inc. All rights reserved.
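The step from non-numerical patterns to relational data can be illustrated with a small sketch. It builds a pairwise Levenshtein distance matrix for a few click streams and then groups them using only that matrix. Note the assumptions: this is not the relational AO or RACE algorithm of the paper; a crisp k-medoids loop is swapped in as a simple stand-in for clustering relational data, and its medoids play the role of the "prototypical click streams" only loosely. The function names, the naive initialization, and the click streams (one letter per page, all patterns distinct) are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance with unit cost for insertion, deletion, substitution."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete x
                            curr[j - 1] + 1,           # insert y
                            prev[j - 1] + (x != y)))   # substitute x by y
        prev = curr
    return prev[-1]


def relation_matrix(patterns: List[Sequence]) -> List[List[int]]:
    """Pairwise distances: the relational data set derived from the patterns."""
    return [[levenshtein(p, q) for q in patterns] for p in patterns]


def assign(R: List[List[int]], medoids: List[int]) -> Dict[int, List[int]]:
    """Assign every pattern index to its nearest medoid (first one on ties)."""
    clusters: Dict[int, List[int]] = {m: [] for m in medoids}
    for i in range(len(R)):
        clusters[min(medoids, key=lambda m: R[i][m])].append(i)
    return clusters


def k_medoids(R: List[List[int]], k: int,
              iterations: int = 10) -> Tuple[List[int], Dict[int, List[int]]]:
    """Crisp k-medoids on a distance matrix (stand-in, not RACE).

    Assumes distinct patterns, so every medoid keeps a non-empty cluster.
    """
    medoids = list(range(k))  # naive initialization: the first k patterns
    for _ in range(iterations):
        clusters = assign(R, medoids)
        # New medoid: the member with the smallest total distance to its cluster.
        updated = [min(members, key=lambda i: sum(R[i][j] for j in members))
                   for members in clusters.values()]
        if updated == medoids:
            break
        medoids = updated
    return medoids, assign(R, medoids)


# Hypothetical click streams: each letter stands for one visited page.
streams = ["ABC", "ABD", "ABCD", "XYZ", "XYW", "XZ"]
R = relation_matrix(streams)
medoids, clusters = k_medoids(R, k=2)
for m, members in clusters.items():
    print(streams[m], [streams[i] for i in members])
```

Because the clustering step reads only the matrix R, the same loop would apply unchanged to text patterns or to relations built from graph distances; only relation_matrix would need a different distance function.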
Pages: 217-236
Page count: 20