Information Filtering and Information Retrieval with the Web Filtering Toolbar

Author
Silva, Josep [1]
Affiliation
[1] Univ Politecn Valencia, Comp Sci Dept, Camino Vera S-N, E-46022 Valencia, Spain
Keywords
Filtering; Web pages; HTML
DOI
10.1016/j.entcs.2009.03.008
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Nowadays, the Internet is the main source of information for millions of people and enterprises. However, the information on the Internet has not been classified yet and, consequently, searching for information is one of the most important tasks performed by users and systems. In particular, for human users of the WWW, searching for information is the main (time-consuming) task. To address this problem, both the industrial and the academic communities have developed many methods and tools to index and search Web pages. The most widespread solution is the use of search engines such as Google and Yahoo; however, while current search engines are suitable for finding a particular Web page, they are of no help in finding the relevant information within that page. Hence, once a Web page is found, the user must search through it to verify whether the needed information is there. Until now, this problem has not been satisfactorily solved. In this paper we present a tool that automatically extracts from a Web page the information (text, images, etc.) related to a filtering criterion, without the use of semantic specifications or indexes and without the need for offline parsing or compilation processes. The tool has been published as an extension for the Firefox Web browser.
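To make the idea of filtering a single page by a criterion concrete, the following is a minimal sketch and not the algorithm used by the paper's tool, which the abstract does not describe. It assumes that plain case-insensitive substring matching is an acceptable stand-in for the filtering criterion, that images are matched through their `alt` text, and that nested blocks may be handled only approximately. The names `BlockCollector`, `BLOCK_TAGS`, and `filter_page` are hypothetical and introduced only for illustration.

```python
from html.parser import HTMLParser

# Assumption: these tags are treated as the units of "information" on a page.
BLOCK_TAGS = {"p", "li", "h1", "h2", "h3", "td"}


class BlockCollector(HTMLParser):
    """Collects the text of block-level elements and the (src, alt) of images."""

    def __init__(self):
        super().__init__()
        self.blocks = []   # finished text blocks, whitespace-normalized
        self.images = []   # (src, alt) pairs
        self._buf = None   # pieces of the block currently open, or None

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._buf = []
        elif tag == "img":
            a = dict(attrs)
            self.images.append((a.get("src") or "", a.get("alt") or ""))

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS and self._buf is not None:
            text = " ".join("".join(self._buf).split())
            if text:
                self.blocks.append(text)
            self._buf = None

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)


def filter_page(html, criterion):
    """Return (texts, image_srcs) whose content mentions the criterion."""
    parser = BlockCollector()
    parser.feed(html)
    parser.close()
    key = criterion.lower()
    texts = [b for b in parser.blocks if key in b.lower()]
    images = [src for src, alt in parser.images if key in alt.lower()]
    return texts, images
```

Calling `filter_page(page_source, "slicing")` on a page's HTML would return only the text blocks and image sources that mention "slicing". The page is parsed on demand, with no index or precomputed annotation, which is the property the abstract emphasizes.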
Pages: 125-136
Page count: 12