Web mining with relational clustering

Cited by: 50
Authors
Runkler, TA
Bezdek, JC
Affiliations
[1] Siemens Corp Technol, D-81730 Munich, Germany
[2] Univ W Florida, Dept Comp Sci, Pensacola, FL 32514 USA
Keywords
fuzzy clustering; relational data; keyword extraction; web content mining; web log mining; click stream analysis
DOI
10.1016/S0888-613X(02)00084-1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is an unsupervised learning method that determines partitions and (possibly) prototypes from pattern sets. Sets of numerical patterns can be clustered by alternating optimization (AO) of clustering objective functions or by alternating cluster estimation (ACE). Sets of non-numerical patterns can often be represented numerically by (pairwise) relations. These relational data sets can be clustered by relational AO and by relational ACE (RACE). We consider two kinds of non-numerical patterns provided by the World Wide Web: document contents such as the text parts of web pages, and sequences of web pages visited by particular users, so-called web logs. The analysis of document contents is often called web content mining, and the analysis of log files with web page sequences is called web log mining. For both non-numerical pattern types (text and web page sequences) relational data sets can be automatically generated using the Levenshtein (edit) distance or using graph distances. The prototypes found for text data can be interpreted as keywords that serve for document classification and automatic archiving. The prototypes found for web page sequences can be interpreted as prototypical click streams that indicate typical user interests, and therefore serve as a basis for web content and web structure management. (C) 2002 Elsevier Science Inc. All rights reserved.
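The abstract notes that relational data sets for text and web page sequences can be generated with the Levenshtein (edit) distance. The following is a minimal sketch of that one step only, not the authors' implementation: the function names `levenshtein` and `relation_matrix` are hypothetical, unit costs for insertion, deletion, and substitution are an assumption, and the clustering itself (relational AO or RACE) is not shown.

```python
from typing import List, Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance between two sequences.

    Assumes unit cost for each insertion, deletion, and substitution.
    Works on strings (characters) and on lists (e.g. page identifiers).
    """
    # prev[j] holds the distance between the first i-1 items of a
    # and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def relation_matrix(patterns: List[Sequence]) -> List[List[int]]:
    """Symmetric pairwise distance matrix with a zero diagonal."""
    n = len(patterns)
    r = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r[i][j] = r[j][i] = levenshtein(patterns[i], patterns[j])
    return r
```

Because the function only compares items for equality, the same code applies to words (sequences of characters) and to click streams (sequences of page identifiers); the resulting matrix is the kind of pairwise relation a relational clustering method would take as input.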
Pages: 217 - 236
Page count: 20
Related papers
50 in total
  • [11] Web Service Clustering Using Relational Database Approach
    Liu, Jianxiao
    Liu, Feng
    Li, Xiaoxia
    He, Keqing
    Ma, Yutao
    Wang, Jian
    INTERNATIONAL JOURNAL OF SOFTWARE ENGINEERING AND KNOWLEDGE ENGINEERING, 2015, 25 (08) : 1365 - 1393
  • [12] A distributed hierarchical clustering system for Web mining
    Wen, CW
    Liu, H
    Wen, WX
    Zheng, J
    ADVANCES IN WEB-AGE INFORMATION MANAGEMENT, PROCEEDINGS, 2001, 2118 : 103 - 113
  • [13] AntClust: Ant clustering and web usage mining
    Labroche, N
    Monmarché, N
    Venturini, G
    GENETIC AND EVOLUTIONARY COMPUTATION - GECCO 2003, PT I, PROCEEDINGS, 2003, 2723 : 25 - 36
  • [14] A New Clustering and Preprocessing for Web Log Mining
    Maheswari, B. Uma
    Sumathi, P.
    2014 WORLD CONGRESS ON COMPUTING AND COMMUNICATION TECHNOLOGIES (WCCCT 2014), 2014, : 25 - +
  • [15] A User Clustering Algorithm on Web Usage Mining
    Sun Hao
    Shen Zhaoxiang
    Zhang Bingbing
    PROCEEDINGS FIRST INTERNATIONAL CONFERENCE ON ELECTRONICS INSTRUMENTATION & INFORMATION SYSTEMS (EIIS 2017), 2017, : 919 - 922
  • [16] Web Usage Mining Based on Fuzzy Clustering
    Yu, Ya-Xiu
    Wang, Xin-Wei
    2009 INTERNATIONAL FORUM ON INFORMATION TECHNOLOGY AND APPLICATIONS, VOL 2, PROCEEDINGS, 2009, : 268 - 271
  • [17] Mining a Web citation database for document clustering
    He, Y
    Hui, SC
    Fong, ACM
    APPLIED ARTIFICIAL INTELLIGENCE, 2002, 16 (04) : 283 - 302
  • [18] Application of Grey Relational Clustering and Data Mining In Information Extraction
    Qu Zhiming
    Wang Xiaoli
    ISBIM: 2008 INTERNATIONAL SEMINAR ON BUSINESS AND INFORMATION MANAGEMENT, VOL 2, 2009, : 3 - +
  • [19] Relational mountain (density) clustering method and web log analysis
    Pal, K
    Pal, NR
    Keller, JM
    Bezdek, JC
    INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS, 2005, 20 (03) : 375 - 392
  • [20] Integrating Web content clustering into Web log association rule mining
    Guo, J
    Keselj, V
    Gao, Q
    ADVANCES IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2005, 3501 : 182 - 193