Efficient and effective Web change detection

Cited by: 9
Authors
Flesca, S
Masciari, E
Affiliations
[1] Univ Calabria, Fac Engn, DEIS, I-87036 Arcavacata Di Rende, Italy
[2] CNR, ICAR, I-87036 Arcavacata Di Rende, Italy
Keywords
update monitoring; continuous queries; WWW tools
DOI
10.1016/S0169-023X(02)00210-0
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a new technique for detecting changes in Web documents. The technique is based on a new method for measuring the similarity of two documents, which represent the current and the previous version of the monitored page. It has been used effectively to discover changes in selected portions of the original document. The technique has been implemented in the CMW system, which provides a change monitoring service on the Web. The main features of CMW are the detection of changes in selected portions of Web documents and the ability to express complex queries on the changed information. For instance, a query can check whether the value of a given stock has increased by more than 10%. Several tests on stock exchange and auction Web pages showed the effectiveness of the proposed approach. © 2002 Elsevier Science B.V. All rights reserved.
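The stock query mentioned in the abstract can be illustrated with a minimal sketch. This is not the CMW implementation or its query language; the function name `increased_by_more_than`, the dictionary snapshots, and the ticker symbols are all hypothetical, and the sketch assumes the monitored values have already been extracted from the previous and current versions of the page.

```python
def increased_by_more_than(old_value: float, new_value: float,
                           threshold: float = 0.10) -> bool:
    """Return True if new_value exceeds old_value by more than `threshold` (relative)."""
    if old_value <= 0:
        raise ValueError("old_value must be positive")
    return (new_value - old_value) / old_value > threshold


# Hypothetical values extracted from the selected portion of a monitored page:
# one snapshot for the previous version, one for the current version.
previous = {"ACME": 50.0, "INIT": 20.0}
current = {"ACME": 56.0, "INIT": 21.0}

# Report only the stocks whose value rose by more than 10% between versions.
changed = [
    symbol
    for symbol in previous
    if symbol in current
    and increased_by_more_than(previous[symbol], current[symbol])
]
print(changed)
```

Here only the stock that rose from 50.0 to 56.0 (a 12% increase) satisfies the condition; a rise of exactly 10% would not, because the comparison is strict.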
Pages: 203-224
Page count: 22