SMARTCRAWLER: A PERSONALIZED WEB SEARCH FOR RELEVANT WEB PAGES

Cited by: 0
Authors
Wardekar, Arati Anilrao [1]
Gupta, Poonam [1]
Affiliations
[1] GH Raisoni Coll Engn & Management, Pune 412207, Maharashtra, India
Source
2018 9TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND NETWORKING TECHNOLOGIES (ICCCNT) | 2018
Keywords
Web Crawler; Inner web; URL Feature selection; IP; Site frequency; Two-stage crawler; Site Ranking; Personalized web search;
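DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809;
Abstract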
On the web, many pages are not indexed quickly by crawling. Many crawlers have been developed to locate inner web (deep web) interfaces efficiently, but because of the large volume of resources on the network and the dynamic nature of the deep web, achieving good results remains a challenging problem. To address this problem, we propose a two-stage framework, SmartCrawler, for finding relevant deep web pages. SmartCrawler obtains its seeds from a seed database. In the first stage, SmartCrawler performs a "reverse search" that matches the user's query against URLs. In the second stage, it performs "Incremental Site Prioritize", in which the query is matched against the content of forms. Pages are then sorted into relevant and irrelevant according to matching frequency and ranked, and the high-ranking pages are displayed on the results page. The proposed crawler efficiently retrieves deep web interfaces from large databases and achieves better results than other existing crawlers. We also propose a comprehensive, personalized search that improves performance by taking into account how long the log file is kept. Before viewing the query before entering the query in the search box that is the focus, enter the search box.
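The two stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`url_matches`, `match_frequency`, `rank_pages`), the in-memory `seed_pages` dictionary standing in for the seed database, the simple token matching, and the `min_frequency` threshold are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def url_matches(query: str, url: str) -> bool:
    """Stage 1 (assumed form): keep a URL if any query term is one of its tokens."""
    url_tokens = set(tokenize(url))
    return any(term in url_tokens for term in tokenize(query))


def match_frequency(query: str, content: str) -> int:
    """Stage 2 (assumed form): count occurrences of query terms in the content."""
    terms = set(tokenize(query))
    return sum(1 for token in tokenize(content) if token in terms)


def rank_pages(
    query: str, pages: Dict[str, str], min_frequency: int = 1
) -> List[Tuple[str, int]]:
    """Filter by URL match, score by term frequency, return relevant pages ranked."""
    candidates = [url for url in pages if url_matches(query, url)]
    scored = [(url, match_frequency(query, pages[url])) for url in candidates]
    # Pages below the (hypothetical) threshold are treated as irrelevant.
    relevant = [(url, freq) for url, freq in scored if freq >= min_frequency]
    return sorted(relevant, key=lambda item: item[1], reverse=True)


# Hypothetical seed data: URL -> text of the page or its search form.
seed_pages = {
    "http://example.com/books/search": "Search books by title. Books in stock: find books here.",
    "http://example.com/music/search": "Search music by artist.",
    "http://example.com/books/about": "About our store.",
    "http://example.org/library/books": "Library catalogue of books.",
}

print(rank_pages("books", seed_pages))
```

Here the URL filter discards the music page, the frequency threshold discards the "about" page whose content never mentions the query, and the remaining pages are ordered by how often the query term appears.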
Pages: 4