EMACrawler: web search engine database freshness optimization

Cited by: 0
Authors
Alanoglu, Zuelfue [1]
Akcayol, M. Ali [2]
Affiliations
[1] Hatay Mustafa Kemal Univ, Antakya Meslek Yuksek Okulu, Bilgisayar Teknolojileri Bolumu, Antakya, Turkiye
[2] Gazi Univ, Muhendislik Fak, Bilgisayar Muhendisligi Bolumu, Ankara, Turkiye
Source
JOURNAL OF POLYTECHNIC-POLITEKNIK DERGISI | 2024 / Vol. 27 / No. 06
Keywords
Web crawler; update module; data collection; data indexing
DOI
10.2339/politeknik.1347054
Chinese Library Classification
T [Industrial Technology]
Subject classification code
08
Abstract
In today's information and technology age, search engines have become an important part of our lives. Although search engines are the first tool people use to access information, the content they offer to users includes old and unnecessary information. Today's search engines often fail to achieve the desired success in providing up-to-date data. To keep the data collected by web crawlers up to date, the timing of revisits must be estimated accurately. This study proposes EMACrawler, which uses exponential moving averages to determine revisit times, the most important factor affecting search engine performance. The proposed method is tested using precision, total coverage, and efficiency metrics. The results show that EMACrawler obtains current data from web pages accurately and quickly. In the experimental studies, EMACrawler was more successful than other methods at obtaining up-to-date data and maintaining the freshness of the crawler database.
Pages: 16
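
The abstract says revisit times are derived from exponential moving averages but does not give the formula. As a rough illustration only, the sketch below applies a standard exponential moving average to the observed gaps between detected page changes and uses the smoothed value as the next revisit interval. The class name `RevisitEstimator`, the `alpha` value, and the use of hours are assumptions made for this example, not details taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RevisitEstimator:
    """Hypothetical helper: smooths observed change intervals with an EMA."""

    alpha: float = 0.5                     # assumed smoothing factor in (0, 1]
    ema_interval: Optional[float] = None   # smoothed change interval, in hours

    def observe(self, interval_hours: float) -> float:
        """Fold one observed gap between page changes into the average."""
        if self.ema_interval is None:
            # The first observation seeds the average.
            self.ema_interval = interval_hours
        else:
            # Standard EMA update: recent gaps weigh more than old ones.
            self.ema_interval = (
                self.alpha * interval_hours
                + (1.0 - self.alpha) * self.ema_interval
            )
        return self.ema_interval


# A page whose changes arrive closer and closer together.
estimator = RevisitEstimator(alpha=0.5)
for gap in [24.0, 12.0, 6.0]:
    next_revisit = estimator.observe(gap)

print(next_revisit)  # 12.0 hours until the next suggested visit
```

A larger `alpha` makes the estimate react faster to recent changes, while a smaller one smooths out irregular update patterns; which setting suits a given crawl is something the abstract does not specify.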