Zipf's Law in Passwords

被引:407
作者
Wang, Ding [1 ,2 ]
Cheng, Haibo [1 ,2 ]
Wang, Ping [3 ,4 ]
Huang, Xinyi [5 ]
Jian, Gaopeng [1 ,2 ]
机构
[1] Peking Univ, Sch EECS, Beijing 100871, Peoples R China
[2] Minist Educ, Key Lab High Confidence Software Technol, Beijing 100871, Peoples R China
[3] Peking Univ, Natl Engn Res Ctr Software Engn, Beijing 100871, Peoples R China
[4] Peking Univ, Sch Software & Microelect, Beijing 100871, Peoples R China
[5] Fujian Normal Univ, Sch Math & Comp Sci, Fuzhou 350007, Fujian, Peoples R China
基金
中国国家自然科学基金;
关键词
User authentication; password distribution; Zipf's law; strength metric; password policy; SECURITY; DISTRIBUTIONS;
D O I
10.1109/TIFS.2017.2721359
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Despite three decades of intensive research efforts, it remains an open question as to what is the underlying distribution of user-generated passwords. In this paper, we make a substantial step forward toward understanding this foundational question. By introducing a number of computational statistical techniques and based on 14 large-scale data sets, which consist of 113.3 million real-world passwords, we, for the first time, propose two Zipf-like models (i.e., PDF-Zipf and CDF-Zipf) to characterize the distribution of passwords. More specifically, our PDF-Zipf model can well fit the popular passwords and obtain a coefficient of determination larger than 0.97; our CDF-Zipf model can well fit the entire password data set, with the maximum cumulative distribution function (CDF) deviation between the empirical distribution and the fitted theoretical model being 0.49%similar to 4.59% (on an average 1.85%). With the concrete knowledge of password distributions, we suggest a new metric for measuring the strength of password data sets. Extensive experimental results show the effectiveness and general applicability of the proposed Zipf-like models and security metric.
引用
收藏
页码:2776 / 2791
页数:16
相关论文
共 59 条
[1]   Security of the J-PAKE Password-Authenticated Key Exchange Protocol [J].
Abdalla, Michel ;
Benhamouda, Fabrice ;
MacKenzie, Philip .
2015 IEEE SYMPOSIUM ON SECURITY AND PRIVACY SP 2015, 2015, :571-587
[2]   Public-Key Encryption Indistinguishable Under Plaintext-Checkable Attacks [J].
Abdalla, Michel ;
Benhamouda, Fabrice ;
Pointcheval, David .
PUBLIC-KEY CRYPTOGRAPHY - PKC 2015, 2015, 9020 :332-352
[3]   Error and attack tolerance of complex networks [J].
Albert, R ;
Jeong, H ;
Barabási, AL .
NATURE, 2000, 406 (6794) :378-382
[4]  
[Anonymous], AMID WIDESPREAD DATA
[5]  
[Anonymous], 2009, 32 MILLION ROCKYOU P
[6]  
[Anonymous], 2012, P 21 INT C WORLD WID
[7]  
[Anonymous], 2013, P 2013 ACM SIGSAC C, DOI [DOI 10.1145/2508859.2516726, 10.1145/2508859.2516726]
[8]  
[Anonymous], 2010, 2010 Proceedings IEEE INFOCOM
[9]  
[Anonymous], 2012, 21 USENIX SEC S USEN
[10]  
Blocki J., 2016, P IEEE CSF SEP, P1