Web mining with relational clustering

Cited by: 50
Authors
Runkler, TA
Bezdek, JC
Affiliations
[1] Siemens Corp Technol, D-81730 Munich, Germany
[2] Univ W Florida, Dept Comp Sci, Pensacola, FL 32514 USA
Keywords
fuzzy clustering; relational data; keyword extraction; web content mining; web log mining; click stream analysis
DOI
10.1016/S0888-613X(02)00084-1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is an unsupervised learning method that determines partitions and (possibly) prototypes from pattern sets. Sets of numerical patterns can be clustered by alternating optimization (AO) of clustering objective functions or by alternating cluster estimation (ACE). Sets of non-numerical patterns can often be represented numerically by (pairwise) relations. These relational data sets can be clustered by relational AO and by relational ACE (RACE). We consider two kinds of non-numerical patterns provided by the World Wide Web: document contents such as the text parts of web pages, and sequences of web pages visited by particular users, so-called web logs. The analysis of document contents is often called web content mining, and the analysis of log files with web page sequences is called web log mining. For both non-numerical pattern types (text and web page sequences) relational data sets can be automatically generated using the Levenshtein (edit) distance or using graph distances. The prototypes found for text data can be interpreted as keywords that serve for document classification and automatic archiving. The prototypes found for web page sequences can be interpreted as prototypical click streams that indicate typical user interests, and therefore serve as a basis for web content and web structure management. (C) 2002 Elsevier Science Inc. All rights reserved.
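The abstract describes turning non-numerical patterns (text or page sequences) into relational data through pairwise Levenshtein distances. The following is a minimal Python sketch of that step only, under stated assumptions: the function names `levenshtein`, `relation_matrix`, and `medoid` are hypothetical, every edit operation costs 1, and the medoid selection is a simple stand-in for picking one prototype. It is not the relational AO or RACE procedure of the paper.

```python
from typing import List, Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Edit distance between two sequences (strings or lists of page names).

    Assumption: insertion, deletion, and substitution each cost 1.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x
                curr[j - 1] + 1,         # insert y
                prev[j - 1] + (x != y),  # substitute, free if equal
            ))
        prev = curr
    return prev[-1]


def relation_matrix(patterns: List[Sequence]) -> List[List[int]]:
    """Pairwise distance matrix: the relational data set for the patterns."""
    n = len(patterns)
    return [[levenshtein(patterns[i], patterns[j]) for j in range(n)]
            for i in range(n)]


def medoid(relation: List[List[int]]) -> int:
    """Index of the pattern with the smallest total distance to all others.

    Illustrative stand-in for a single prototype, not the paper's method.
    """
    totals = [sum(row) for row in relation]
    return totals.index(min(totals))
```

Because `levenshtein` accepts any sequence, the same function applies to words (sequences of characters) and to click streams (sequences of page identifiers), for example `levenshtein(["home", "products", "cart"], ["home", "cart"])`.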
Pages: 217-236
Number of pages: 20