Password Guessing Based on Recurrent Neural Networks and Generative Adversarial Networks

Cited by: 0

Authors:

Wang D. ^{[1,2]}

Zou Y.-K. ^{[1,2]}

Tao Y. ^{[3]}

Wang B. ^{[3]}

Affiliations:

[1] College of Cyber Science, Nankai University, Tianjin

[2] Tianjin Key Laboratory of Network and Data Security Technology (Nankai University), Tianjin

[3] School of Electronics Engineering and Computer Science, Peking University, Beijing

Source:

Jisuanji Xuebao/Chinese Journal of Computers | 2021 / Vol. 44 / No. 08

Keywords:

Deep learning; Generative adversarial network; Guessing attack; Password; Recursive neural network;

DOI:

10.11897/SP.J.1016.2021.01519

Abstract:

Progress in deep learning offers a potential way to improve the efficiency of password cracking. Existing research has applied deep learning models such as Recurrent Neural Networks (RNN) and Generative Adversarial Networks (GAN) to password guessing. Building on implementations of password guessing algorithms such as RNN, PL (the model-level combination of Probabilistic Context-Free Grammar (PCFG) and Long Short-Term Memory (LSTM)) and GAN, this paper replaces the LSTM in the PL model with an RNN and proposes the PR model (the combination of PCFG and RNN). To reduce the guessing model's dependence on large training samples, we use the RNN to generate the filling set for the letter segments of passwords, and propose the PR+ model. In the experiments, we use 4 different datasets to test the cracking ability of the different models. The results show that, on most datasets, the cracking rate of the PR model is slightly higher than that of the PL model and significantly higher than that of the traditional mainstream models, i.e., PCFG (<10^7 guesses) and Markov (<10^6 guesses). At the same time, the training efficiency of the PR model is far better than that of the PL model. Because different models generate password samples with different characteristics, we further run the same test process with combinations of different guess sets on 4 real large-scale password datasets. To the best of our knowledge, this is the first confirmation that, under the same number of guesses (10^7-10^8), a combined guess set drawn from different models achieves a higher cracking rate than a single model's guess set. For the GAN model, by contrast, the cracking rate is only 31.41% at 3.6×10^8 guesses. This indicates that the cracking rate of GAN is inferior to that of traditional statistics-based methods (such as PCFG and Markov) and RNN-based models, and we further explain the reason. © 2021, Science Press. All rights reserved.

Pages: 1519-1534

Page count: 15
