Web mining with relational clustering

Cited by: 50
Authors
Runkler, TA
Bezdek, JC
Affiliations
[1] Siemens Corp Technol, D-81730 Munich, Germany
[2] Univ W Florida, Dept Comp Sci, Pensacola, FL 32514 USA
Keywords
fuzzy clustering; relational data; keyword extraction; web content mining; web log mining; click stream analysis
DOI
10.1016/S0888-613X(02)00084-1
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering is an unsupervised learning method that determines partitions and (possibly) prototypes from pattern sets. Sets of numerical patterns can be clustered by alternating optimization (AO) of clustering objective functions or by alternating cluster estimation (ACE). Sets of non-numerical patterns can often be represented numerically by (pairwise) relations. These relational data sets can be clustered by relational AO and by relational ACE (RACE). We consider two kinds of non-numerical patterns provided by the World Wide Web: document contents such as the text parts of web pages, and sequences of web pages visited by particular users, so-called web logs. The analysis of document contents is often called web content mining, and the analysis of log files with web page sequences is called web log mining. For both non-numerical pattern types (text and web page sequences) relational data sets can be automatically generated using the Levenshtein (edit) distance or using graph distances. The prototypes found for text data can be interpreted as keywords that serve for document classification and automatic archiving. The prototypes found for web page sequences can be interpreted as prototypical click streams that indicate typical user interests, and therefore serve as a basis for web content and web structure management. (C) 2002 Elsevier Science Inc. All rights reserved.
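The abstract notes that relational data sets can be generated automatically from text using the Levenshtein (edit) distance. The following is a minimal sketch of that step only, not the authors' implementation: the function names `levenshtein` and `relation_matrix` and the sample word list are assumptions made for illustration, and unit costs for insertion, deletion, and substitution are assumed.

```python
from typing import List, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insert, delete, and substitute."""
    # Keep the shorter string in the inner loop to limit row length.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def relation_matrix(patterns: Sequence[str]) -> List[List[int]]:
    """Symmetric matrix of pairwise edit distances with a zero diagonal."""
    n = len(patterns)
    r = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r[i][j] = r[j][i] = levenshtein(patterns[i], patterns[j])
    return r


# Hypothetical sample patterns, chosen only to illustrate the idea.
words = ["cluster", "clusters", "clustering", "web"]
R = relation_matrix(words)
```

A matrix such as `R` is the kind of pairwise relational input that a relational clustering method would consume; the clustering step itself is not sketched here.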
Pages: 217-236
Number of pages: 20
Related papers
50 in total
  • [21] Using Incremental Fuzzy Clustering to Web Usage Mining
    Aghabozorgi, Saeed R.
    Teh, Ying Wah
    2009 INTERNATIONAL CONFERENCE OF SOFT COMPUTING AND PATTERN RECOGNITION, 2009, : 653 - 658
  • [22] Web Usage Mining based on Clustering of Browsing Features
    Lee, Chu-Hui
    Fu, Yu-Hsiang
    ISDA 2008: EIGHTH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS, VOL 1, PROCEEDINGS, 2008, : 281 - 286
  • [23] Web service clustering using text mining techniques
    Liu, Wei
    Wong, Wilson
    International Journal of Agent-Oriented Software Engineering, 2009, 3 (01) : 6 - 26
  • [24] Adaptive support vector clustering for multi-relational data mining
    Ling, Ping
    Zhou, Chun-Guang
    ADVANCES IN NEURAL NETWORKS - ISNN 2006, PT 1, 2006, 3971 : 1222 - 1230
  • [25] Road traffic accident data mining based on grey relational clustering
    Liu Y.
    Xu H.
    Zhang C.
    Shi X.D.
    Patnaik S.
    Advances in Transportation Studies, 2023, 3 (Special issue) : 113 - 124
  • [26] Rough-fuzzy relational clustering algorithm for biological sequence mining
    Maji, Pradipta
    Pal, Sankar K.
    ROUGH SETS AND KNOWLEDGE TECHNOLOGY, 2008, 5009 : 292 - 299
  • [27] Association Based Classification for Relational Data and Its Use in Web Mining
    Bartik, Vladimir
    2009 IEEE SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DATA MINING, 2009, : 252 - 258
  • [28] On Reducing Redundancy in Mining Relational Association Rules from the Semantic Web
    Jozefowska, Joanna
    Lawrynowicz, Agnieszka
    Lukaszewski, Tomasz
    WEB REASONING AND RULE SYSTEMS, PROCEEDINGS, 2008, 5341 : 205 - 213
  • [29] Web log mining based on improved FCM clustering algorithm
    Wang Zhijun
    Zhou Runjing
    INTERNATIONAL CONFERENCE ON IMAGE PROCESSING AND PATTERN RECOGNITION IN INDUSTRIAL ENGINEERING, 2010, 7820
  • [30] Frequent Itemset Mining for Clustering Near Duplicate Web Documents
    Ignatov, Dmitry I.
    Kuznetsov, Sergei O.
    CONCEPTUAL STRUCTURES: LEVERAGING SEMANTIC TECHNOLOGIES, PROCEEDINGS, 2009, 5662 : 185 - 200