Towards IP-based Geolocation via Fine-grained and Stable Webcam Landmarks

Cited by: 13
Authors
Wang, Zhihao [1]
Li, Qiang [2]
Song, Jinke [2]
Wang, Haining [3]
Sun, Limin [1]
Affiliations
[1] Chinese Acad Sci, Univ Chinese Acad Sci, Sch Cyber Secur, Inst Informat Engn, Beijing, Peoples R China
[2] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing, Peoples R China
[3] Virginia Polytech Inst & State Univ, Dept Elect & Comp Engn, Arlington, VA USA
Source
WEB CONFERENCE 2020: PROCEEDINGS OF THE WORLD WIDE WEB CONFERENCE (WWW 2020) | 2020
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Internet of Things; Data Mining; Webcam; Information Extraction; IP Geolocation; Landmarks; INTERNET;
DOI
10.1145/3366423.3380216
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
IP-based geolocation is essential for location-aware Internet applications such as online advertising, content delivery, and online fraud prevention. Accurate geolocation relies heavily on the number of high-quality landmarks, i.e., landmarks that are fine-grained and stable over time. However, previous efforts to gather landmarks have been impeded by the limited number of landmarks visible on the Internet and by the cost of manual collection. In this paper, we leverage the many online webcams that monitor physical surroundings as a rich source of promising high-quality landmarks for IP-based geolocation. In particular, we present a new framework called GeoCAM, which is designed to automatically generate qualified landmarks from online webcams, providing IP-based geolocation services with high accuracy and wide coverage. GeoCAM periodically monitors websites that host live webcams and uses natural language processing techniques to extract the IP addresses and latitude/longitude of webcams, generating landmarks at large scale. We develop a prototype of GeoCAM and conduct real-world experiments to validate its efficacy. Our results show that GeoCAM can detect 282,902 live webcams hosted in webpages with 94.2% precision and 90.4% recall, and then generate 16,863 stable and fine-grained landmarks, two orders of magnitude more than the landmarks used in prior work. Thus, by correlating a large number of landmarks, GeoCAM is able to provide a geolocation service with high accuracy and wide coverage.
Pages: 1422-1432
Page count: 11