Demographic Information Inference through Meta-Data Analysis of Wi-Fi Traffic

Cited by: 31
Authors
Li, Huaxin [1]
Zhu, Haojin [1]
Ma, Di [2]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai Key Lab Scalable Comp & Syst, Shanghai 200240, Peoples R China
[2] Univ Michigan Dearborn, Coll Engn & Comp Sci, Dearborn, MI 48128 USA
Funding
U.S. National Science Foundation
Keywords
Privacy leakage; traffic analysis; demographics inference
DOI
10.1109/TMC.2017.2753244
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Privacy inference through meta-data (e.g., IP, Host) analysis of Wi-Fi traffic poses a potentially more serious threat to user privacy than content-based analysis, for two reasons. First, it offers a more efficient and scalable way to infer users' sensitive information, because it does not need to inspect the content of Wi-Fi traffic. Second, meta-data based demographics inference works on both unencrypted and encrypted traffic (e.g., HTTPS traffic). In this study, we present a novel approach to inferring user demographic information by exploiting the meta-data of Wi-Fi traffic. We develop an inference framework based on machine learning and evaluate its performance on a real-world dataset that covers the Wi-Fi access of 28,158 users over five months. The framework extracts four kinds of features from real-world Wi-Fi traffic and applies a novel machine learning technique (XGBoost) to predict user demographics. Our analytical results show that the overall accuracy of inferring users' gender and education level reaches 82 and 78 percent, respectively. Surprisingly, even for HTTPS traffic, user demographics can still be predicted with accuracies of 69 and 76 percent, respectively, which demonstrates the practicality of the proposed privacy inference scheme. Finally, we discuss and evaluate potential mitigation methods for such inference attacks.
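To make the idea of meta-data features concrete, the following is a minimal sketch of turning (user, host) records into per-user count vectors. It is an illustration only, not the authors' method: the abstract does not specify the four kinds of features, so the function name `host_count_features`, the record layout, and the example hostnames are all assumptions.

```python
from collections import Counter, defaultdict


def host_count_features(records, vocabulary=None):
    """Build per-user host-visit count vectors from (user_id, host) records.

    Hypothetical helper: the record layout and the choice of host counts
    as a feature are assumptions, not taken from the paper.
    """
    per_user = defaultdict(Counter)
    for user_id, host in records:
        # Only the host name (meta-data) is used; no payload is inspected.
        per_user[user_id][host.lower()] += 1

    if vocabulary is None:
        # Fix a stable column order so every user vector has the same layout.
        vocabulary = sorted({h for counts in per_user.values() for h in counts})

    features = {
        user_id: [counts[h] for h in vocabulary]
        for user_id, counts in per_user.items()
    }
    return vocabulary, features


# Made-up records for illustration; real input would come from traffic logs.
records = [
    ("u1", "news.example.com"),
    ("u1", "shop.example.com"),
    ("u1", "news.example.com"),
    ("u2", "games.example.org"),
    ("u2", "shop.example.com"),
]

vocabulary, features = host_count_features(records)
for user_id, vector in sorted(features.items()):
    print(user_id, vector)
```

Vectors of this shape, paired with known labels for a subset of users, could then be passed to a gradient-boosted tree classifier such as XGBoost. Because the sketch reads only host names, the same kind of feature remains available when the payload is encrypted.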
Pages: 1033-1047
Page count: 15