More Realistic Website Fingerprinting Using Deep Learning

Cited by: 8
Authors
Cui, Weiqi [1]
Chen, Tao [1]
Chan-Tin, Eric [2]
Affiliations
[1] Oklahoma State Univ, Comp Sci Dept, Stillwater, OK 74078 USA
[2] Loyola Univ, Dept Comp Sci, Chicago, IL 60611 USA
Source
2020 IEEE 40TH INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS (ICDCS) | 2020
Keywords
Privacy; Website Fingerprinting; Anonymity; Deep Learning; Practicality;
DOI
10.1109/ICDCS47774.2020.00058
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812 ;
Abstract
Website fingerprinting (WF) allows a passive local eavesdropper to monitor the encrypted channel over which users browse the Internet and to determine, from the recorded traffic, which website a user is visiting. Recent work has explored the effectiveness of deep learning (DL) in WF attacks. However, these attacks are all built and evaluated on one-page traces. Our goal is to explore whether deep learning can handle situations in which the captured traces are not best-case for an adversary, such as partial traces and two-page traces. We aim to narrow the gap between lab experiments and realistic conditions. We evaluate our proposed method in both closed-world and open-world settings and find that a Convolutional Neural Network (CNN) outperforms a Long Short-Term Memory network (LSTM) in all scenarios. The CNN also shows great potential for predicting from a smaller number of packets. For partial traces missing 20% of the packets at the beginning of the trace, adding head detection improves the accuracy from 8.28% with the original DL model to 86.93%. We then report the accuracy of predicting on two-page traces. With an overlap of 80% between two websites, we achieve in our simulation an accuracy of 89.25% and 74.2% for the first and second website in the closed-world evaluation, and 95.5% and 75% in the open-world evaluation. To verify our simulation results, we set up a crawler to collect both training and testing data and gathered the largest two-page trace testing dataset ever used. The results of the real-world experiment are consistent with the simulation.
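The two non-ideal conditions named in the abstract, partial traces and two-page traces, can be illustrated with a minimal sketch. This is not the authors' simulation code. It assumes a trace is a list of (timestamp, direction) pairs, and it assumes that "overlap" means the second page starts loading once a fraction 1 - overlap of the first page's duration has elapsed. The abstract does not define either; the names `drop_head` and `merge_two_pages` are hypothetical.

```python
from typing import List, Tuple

# Assumed trace representation: (timestamp in seconds, direction),
# where direction is +1 for outgoing and -1 for incoming packets.
Packet = Tuple[float, int]


def drop_head(trace: List[Packet], fraction: float) -> List[Packet]:
    """Return a partial trace with the first `fraction` of packets removed."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    start = int(len(trace) * fraction)
    return trace[start:]


def merge_two_pages(first: List[Packet], second: List[Packet],
                    overlap: float) -> List[Packet]:
    """Interleave two one-page traces into a single two-page trace.

    Assumption: the second page starts when (1 - overlap) of the first
    page's duration has elapsed; packets are then merged by timestamp.
    """
    if not first or not second:
        raise ValueError("both traces must be non-empty")
    if not 0.0 <= overlap <= 1.0:
        raise ValueError("overlap must be in [0, 1]")
    first_start = first[0][0]
    duration = first[-1][0] - first_start
    offset = first_start + (1.0 - overlap) * duration
    second_start = second[0][0]
    shifted = [(t - second_start + offset, d) for t, d in second]
    # sorted() is stable, so ties keep first-page packets ahead.
    return sorted(first + shifted, key=lambda p: p[0])
```

For example, `drop_head(trace, 0.2)` yields a trace missing its first 20% of packets, and `merge_two_pages(a, b, 0.8)` yields a combined trace under the overlap definition assumed above.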
Pages: 333-343
Page count: 11