Hybrid Focused Crawling for Homemade Explosives Discovery on Surface and Dark Web

Cited by: 14
Authors
Iliou, Christos [1]
Kalpakis, George [1]
Tsikrika, Theodora [1]
Vrochidis, Stefanos [1]
Kompatsiaris, Ioannis [1]
Affiliations
[1] CERTH, Inst Informat Technol, Thessaloniki, Greece
Source
PROCEEDINGS OF 2016 11TH INTERNATIONAL CONFERENCE ON AVAILABILITY, RELIABILITY AND SECURITY (ARES 2016) | 2016
Keywords
focused crawling; Dark Web; darknets; Tor; I2P; Freenet;
DOI
10.1109/ARES.2016.66
Chinese Library Classification number
TP301 [Theory and methods];
Subject classification code
081202
Abstract
This work proposes a generic focused crawling framework for discovering resources on any given topic that reside on the Surface or the Dark Web. The proposed crawler is able to seamlessly traverse the Surface Web and several darknets present in the Dark Web (i.e. Tor, I2P and Freenet) during a single crawl by automatically adapting its crawling behavior and its classifier-guided hyperlink selection strategy based on the network type. This hybrid focused crawler is demonstrated for the discovery of Web resources containing recipes for producing homemade explosives. The evaluation experiments indicate the effectiveness of the proposed approach both for the Surface and the Dark Web.
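The abstract does not say how the crawler recognises the network type or how its link selection changes per network. As a minimal sketch only, assuming network type can be inferred from the URL itself, the idea might look like the following. The function names, the Freenet proxy-style URL form, and the threshold values are all illustrative assumptions, not the authors' implementation.

```python
from urllib.parse import urlparse

# Freenet key types as they appear in URIs served through a local proxy
# (assumption: Freenet pages are reached via URLs such as
# http://127.0.0.1:8888/USK@.../).
FREENET_KEY_PREFIXES = ("CHK@", "SSK@", "USK@", "KSK@")

# Hypothetical per-network relevance thresholds; NOT taken from the paper.
FOLLOW_THRESHOLD = {"surface": 0.5, "tor": 0.3, "i2p": 0.3, "freenet": 0.3}


def network_type(url: str) -> str:
    """Guess the network a URL belongs to from its host name or path."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.endswith(".onion"):
        return "tor"
    if host.endswith(".i2p"):
        return "i2p"
    # Strip the leading slash and an optional "freenet:" scheme-like prefix.
    path = parsed.path.lstrip("/")
    if path.startswith("freenet:"):
        path = path[len("freenet:"):]
    if path.startswith(FREENET_KEY_PREFIXES):
        return "freenet"
    return "surface"


def should_follow(url: str, relevance_score: float) -> bool:
    """Follow a hyperlink if its classifier score meets the network's threshold."""
    return relevance_score >= FOLLOW_THRESHOLD[network_type(url)]
```

Here `relevance_score` stands in for the output of whatever hyperlink classifier guides the crawl; a lower threshold for darknet links is one possible way to adapt the selection strategy, chosen here purely for illustration.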
Pages: 229-234
Number of pages: 6
Related papers
12 items in total
[1]  
[Anonymous], 2011, DARK WEB EXPLORING D
[2]   Content and popularity analysis of Tor hidden services [J].
Biryukov, Alex ;
Pustogarov, Ivan ;
Thill, Fabrice ;
Weinmann, Ralf-Philipp .
2014 IEEE 34TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS WORKSHOPS (ICDCSW), 2014, :188-193
[3]   Uncovering the Dark Web: A case study of Jihad on the web [J].
Chen, Hsinchun ;
Chung, Wingyan ;
Qin, Jialun ;
Reid, Edna ;
Sageman, Marc ;
Weimann, Gabriel .
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2008, 59 (08) :1347-1359
[4]  
Ciancaglini V, 2015, THE DEEP WEB
[5]   A Focused Crawler for Dark Web Forums [J].
Fu, Tianjun ;
Abbasi, Ahmed ;
Chen, Hsinchun .
JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2010, 61 (06) :1213-1231
[6]  
Kalpakis G., 18 INT C HU IN PRESS
[7]   Concept Detection in Multimedia Web Resources about Home Made Explosives [J].
Kalpakis, George ;
Tsikrika, Theodora ;
Markatopoulou, Foteini ;
Pittaras, Nikiforos ;
Vrochidis, Stefanos ;
Mezaris, Vasileios ;
Patras, Ioannis ;
Kompatsiaris, Ioannis .
PROCEEDINGS 10TH INTERNATIONAL CONFERENCE ON AVAILABILITY, RELIABILITY AND SECURITY ARES 2015, 2015, :632-641
[8]   Web crawling [J].
Olston C. ;
Najork M. .
Foundations and Trends in Information Retrieval, 2010, 4 (03) :175-246
[9]   Link contexts in classifier-guided topical crawlers [J].
Pant, G ;
Srinivasan, P .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2006, 18 (01) :107-122
[10]   Learning to crawl: Comparing classification schemes [J].
Pant, G ;
Srinivasan, P .
ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2005, 23 (04) :430-462