Semantic Domain Specific Search Engine

Cited by: 1
Authors
Chandrashekar, B. H. [1]
Shobha, G. [2]
Affiliations
[1] RV Coll Engn, Dept Master Comp Applicat, Bangalore 560059, Karnataka, India
[2] RV Coll Engn, Dept Comp Sci & Engn, Bangalore 560059, Karnataka, India
Source
2010 2ND INTERNATIONAL CONFERENCE ON COMPUTER AND AUTOMATION ENGINEERING (ICCAE 2010), VOL 2 | 2010
Keywords
WebCrawler; domain; lexicon; webpage; Internet;
DOI
10.1109/ICCAE.2010.5451718
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The World Wide Web (WWW) is growing exponentially, and retrieving relevant information from it has become increasingly difficult. The rapid growth of the WWW poses unprecedented scaling challenges for general-purpose crawlers and search engines. In this paper we describe a new hypertext resource discovery system called a topic-specific crawler. The goal of this crawler is to selectively seek out pages that are relevant to a predefined set of topics, rather than collecting and indexing all accessible web documents in order to answer all possible ad hoc queries. A topic-specific crawler analyses its crawl boundary to find the links that are likely to be most relevant to the crawl. This leads to significant savings in hardware and network resources and helps keep the crawl more up to date.
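The idea of analysing the crawl boundary can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the lexicon terms, the toy in-memory link graph `TOY_WEB`, the lexicon-overlap score in `relevance`, and the `threshold` parameter. It keeps the boundary in a priority queue, visits the most promising link first, and does not follow links from pages that score below the threshold.

```python
import heapq
import re

# Hypothetical topic lexicon; the abstract does not list the actual terms.
TOPIC_LEXICON = {"crawler", "search", "index", "web"}

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "seed": ("web search portal", ["a", "b"]),
    "a": ("focused crawler builds a search index for the web", ["c"]),
    "b": ("recipes for chocolate cake", ["d"]),
    "c": ("crawler frontier and index design", []),
    "d": ("more cake recipes", []),
}


def relevance(text):
    """Fraction of lexicon terms that occur in the text (0.0 to 1.0)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return len(words & TOPIC_LEXICON) / len(TOPIC_LEXICON)


def focused_crawl(seed, max_pages=10, threshold=0.25):
    """Visit pages best-first; expand links only from relevant pages."""
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so priorities are negated
    seen = {seed}
    collected = []
    while frontier and len(collected) < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = TOY_WEB[url]
        score = relevance(text)
        if score < threshold:
            continue  # off-topic page: its links are not followed
        collected.append((url, score))
        for link in links:
            if link not in seen:
                seen.add(link)
                # Assumption: a link inherits the score of the page citing it.
                heapq.heappush(frontier, (-score, link))
    return collected
```

Calling `focused_crawl("seed")` on this toy graph collects the on-topic pages and never fetches page `d`, because it is reachable only through the off-topic page `b`. That pruning is the source of the resource savings the abstract mentions, although a real crawler would fetch pages over the network and would likely use a stronger relevance model.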
Pages: 669-672
Page count: 4