FilteredWeb: A Framework for the Automated Search-Based Discovery of Blocked URLs

Cited by: 0
Authors
Darer, Alexander [1]
Farnan, Oliver [1]
Wright, Joss [2]
Affiliations
[1] Univ Oxford, Dept Comp Sci, Oxford, England
[2] Univ Oxford, Oxford Internet Inst, Oxford, England
Source
TMA CONFERENCE 2017 - PROCEEDINGS OF THE 1ST NETWORK TRAFFIC MEASUREMENT AND ANALYSIS CONFERENCE | 2017
Funding
Engineering and Physical Sciences Research Council (UK);
Keywords
censorship; filtering; DNS; Chinese Internet; search; China
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
Various methods have been proposed for creating and maintaining lists of potentially filtered URLs to allow for measurement of ongoing internet censorship around the world. Whilst testing a known resource for evidence of filtering can be relatively simple, given appropriate vantage points, discovering previously unknown filtered web resources remains an open challenge. We present a novel framework for automating the process of discovering filtered resources through the use of adaptive queries to well-known search engines. Our system applies information retrieval algorithms to isolate characteristic linguistic patterns in known filtered web pages; these are used as the basis for web search queries. The URLs returned by these searches are checked for evidence of filtering, and newly discovered blocked resources are fed back into the system to detect further filtered content. Our implementation of this framework, applied to China as a case study, shows the approach is effective at detecting significant numbers of previously unknown filtered web pages, contributing to the ongoing detection of internet filtering as it develops. When deployed, this system was used to discover 1355 poisoned domains within China as of February 2017, 30 times more than in the most widely used published filter list of the time. Of these, 759 are outside of the Alexa Top 1000 domains list, demonstrating the capability of this framework to find more obscure filtered content. Further, our initial analysis of filtered URLs, and the search terms that were used to discover them, gives further insight into the nature of the content currently being blocked in China.
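
The feedback loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract says only that "information retrieval algorithms" are used, so TF-IDF is assumed here as one plausible choice, and every name (`tokenize`, `top_terms`, `discover`, and the injected `fetch`, `search` and `is_filtered` callables) is hypothetical. The search engine and the filtering test are passed in as functions, so no real service or censorship check is modelled.

```python
import math
from collections import Counter, deque
from typing import Callable, Iterable, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer that keeps purely alphabetic tokens."""
    return [t for t in text.lower().split() if t.isalpha()]


def top_terms(page_text: str, background_docs: List[str], k: int = 3) -> List[str]:
    """Rank the terms of one page by TF-IDF against a background corpus.

    Assumption: TF-IDF stands in for the unspecified information
    retrieval algorithm mentioned in the abstract.
    """
    tf = Counter(tokenize(page_text))
    doc_sets = [set(tokenize(d)) for d in background_docs]
    n = len(doc_sets)

    def idf(term: str) -> float:
        df = sum(1 for s in doc_sets if term in s)
        # Smoothed IDF, so terms absent from the background stay finite.
        return math.log((1 + n) / (1 + df)) + 1.0

    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(tf, key=lambda t: (-tf[t] * idf(t), t))
    return ranked[:k]


def discover(
    seed_urls: Iterable[str],
    fetch: Callable[[str], str],            # hypothetical: URL -> page text
    search: Callable[[str], List[str]],     # hypothetical: query -> result URLs
    is_filtered: Callable[[str], bool],     # hypothetical: URL -> blocked?
    background_docs: List[str],
    max_pages: int = 100,
) -> Set[str]:
    """Expand a seed list of known filtered URLs via search queries."""
    seeds = list(seed_urls)
    queue = deque(seeds)
    blocked = set(seeds)
    tested = set(seeds)
    processed = 0
    while queue and processed < max_pages:
        processed += 1
        url = queue.popleft()
        for term in top_terms(fetch(url), background_docs):
            for candidate in search(term):
                if candidate in tested:
                    continue
                tested.add(candidate)
                if is_filtered(candidate):
                    # Newly found blocked pages are fed back as new seeds.
                    blocked.add(candidate)
                    queue.append(candidate)
    return blocked
```

Injecting `fetch`, `search` and `is_filtered` keeps the loop testable offline; in a real deployment each would need a vantage point and measurement method that this sketch does not attempt to specify.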
Pages: 9