Bulk GCD Computation Using a GPU to Break Weak RSA Keys

被引:5
作者
Fujita, Toru [1 ]
Nakano, Koji [1 ]
Ito, Yasuaki [1 ]
机构
[1] Hiroshima Univ, Dept Informat Engn, Kagamiyama 1-4-1, Higashihiroshima 7398527, Japan
来源
2015 IEEE 29TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS | 2015年
关键词
MEMORY MACHINE; ALGORITHMS;
D O I
10.1109/IPDPSW.2015.54
中图分类号
TP3 [计算技术、计算机技术];
学科分类号
0812 ;
摘要
RSA is one the most well-known public-key cryptosystems widely used for secure data transfer. An RSA encryption key includes a modulus n which is the product of two large prime numbers p and q. If an RSA modulus n can be decomposed into p and q, the corresponding decryption key can be computed easily from them and the original message can be obtained using it. RSA cryptosystem relies on the hardness of factorization of RSA modulus. Suppose that we have a lot of encryption keys collected from the Web. If some of them are inappropriately generated so that they share the same prime number, then they can be decomposed by computing their GCD (Greatest Common Divisor). Actually, a previously published investigation showed that a certain ratio of RSA moduli in encryption keys in the Web are sharing prime numbers. We may find such weak RSA moduli n by computing the GCD of many pairs of RSA moduli. The main contribution of this paper is to present a new Euclidean algorithm for computing the GCD of all pairs of encryption moduli. The idea of our new Euclidean algorithm that we call Approximate Euclidean algorithm is to compute an approximation of quotient by just one 64-bit division and to use it for reducing the number of iterations of the Euclidean algorithm. We also present an implementation of Approximate Euclidean algorithm optimized for CUDA-enabled GPUs. The experimental results show that our implementation for 1024-bit GCD on GeForce GTX 780Ti runs more than 80 times faster than the Intel Xeon CPU implementation. Further, our GPU implementation is more than 9 times faster than the best known published GCD computation using the same generation GPU.
引用
收藏
页码:385 / 394
页数:10
相关论文
共 23 条
  • [1] [Anonymous], 1997, The Art of Computer Programming: Seminumerical Algorithms
  • [2] [Anonymous], 2011, GPU Computing Gems Emerald Edition. Applications of GPU Computing Series
  • [3] [Anonymous], P INT C ALG ARCH PAR
  • [4] [Anonymous], 2012, NVIDIA CUDA C programming guide
  • [5] [Anonymous], NVIDIA CUDA C BEST P
  • [6] [Anonymous], THESIS
  • [7] [Anonymous], 2011, INT J NETW COMPUT
  • [8] CORMEN TH, 2001, INTRO ALGORITHMS
  • [9] High Throughput Multiple-Precision GCD on the CUDA Architecture
    Fujimoto, Noriyuki
    [J]. 2009 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT 2009), 2009, : 507 - 512
  • [10] An Optimal Offline Permutation Algorithm on the Hierarchical Memory Machine, with the GPU implementation
    Kasagi, Akihiko
    Nakano, Koji
    Ito, Yasuaki
    [J]. 2013 42ND ANNUAL INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP), 2013, : 1 - 10