GPU-based Private Information Retrieval for On-Device Machine Learning Inference

Authors
Lam, Maximilian [2]
Johnson, Jeff [1]
Xiong, Wenjie [3]
Maeng, Kiwan [4]
Gupta, Udit [6]
Li, Yang [1]
Lai, Liangzhen [1]
Leontiadis, Ilias [1]
Rhu, Minsoo [1]
Lee, Hsien-Hsin S. [5]
Reddi, Vijay Janapa [2]
Wei, Gu-Yeon [2]
Brooks, David [2]
Suh, G. Edward [1, 6]
Affiliations
[1] Meta AI, Menlo Park, CA 94025 USA
[2] Harvard University, Cambridge, MA 02138 USA
[3] Virginia Tech, Blacksburg, VA USA
[4] Penn State, University Park, PA USA
[5] Intel, Santa Clara, CA USA
[6] Cornell University, Ithaca, NY USA
Source
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS, ASPLOS 2024, VOL 1 | 2024
Keywords
privacy; security; cryptography; machine learning; GPU; performance; protection
DOI
10.1145/3617232.3624855
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
On-device machine learning (ML) inference can enable the use of private user data on user devices without revealing them to remote servers. However, a pure on-device solution to private ML inference is impractical for many applications that rely on embedding tables that are too large to be stored on-device. In particular, recommendation models typically use multiple embedding tables, each on the order of 1-10 GB, making them impractical to store on-device. To overcome this barrier, we propose the use of private information retrieval (PIR) to efficiently and privately retrieve embeddings from servers without sharing any private information. As off-the-shelf PIR algorithms are usually too computationally intensive to use directly for latency-sensitive inference tasks, we 1) propose novel GPU-based acceleration of PIR, and 2) co-design PIR with the downstream ML application to obtain further speedup. Our GPU acceleration strategy improves system throughput by more than 20x over an optimized CPU PIR implementation, and our PIR-ML co-design provides an additional throughput improvement of over 5x at fixed model quality. Together, for various on-device ML applications such as recommendation and language modeling, our system on a single V100 GPU can serve up to 100,000 queries per second (a more than 100x throughput improvement over a CPU-based baseline) while maintaining model accuracy.
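The abstract does not say which PIR construction the system uses, so the following is only a minimal sketch of the general idea, assuming the textbook two-server XOR scheme with non-colluding servers rather than the authors' method. All names (`make_queries`, `answer`, `reconstruct`, `private_lookup`, `TABLE`) are hypothetical and the table is a toy stand-in for an embedding table. Note that each server touches every row to answer a single query, which gives a sense of why PIR is computationally heavy on the server side.

```python
import secrets

def make_queries(n, index):
    """Client: split the selection of `index` into two bit vectors.

    Each vector alone is uniformly random, so neither server learns `index`
    (assuming the two servers do not collude).
    """
    q_a = [secrets.randbits(1) for _ in range(n)]
    q_b = list(q_a)
    q_b[index] ^= 1  # the two queries differ only at the wanted row
    return q_a, q_b

def answer(table, query):
    """Server: XOR together every row whose query bit is 1."""
    acc = bytes(len(table[0]))  # all-zero accumulator, one row wide
    for row, bit in zip(table, query):
        if bit:
            acc = bytes(x ^ y for x, y in zip(acc, row))
    return acc

def reconstruct(ans_a, ans_b):
    """Client: rows selected by both queries cancel, leaving the wanted row."""
    return bytes(x ^ y for x, y in zip(ans_a, ans_b))

def private_lookup(table, index):
    """Run the whole exchange locally, for illustration only."""
    q_a, q_b = make_queries(len(table), index)
    return reconstruct(answer(table, q_a), answer(table, q_b))

# Toy "embedding table": 8 rows of 4 bytes each (illustrative data only).
TABLE = [bytes([i, i + 1, i + 2, i + 3]) for i in range(8)]

if __name__ == "__main__":
    assert private_lookup(TABLE, 5) == TABLE[5]
```

In a real deployment the two `answer` calls would run on separate servers, and the query would be compressed rather than sent as one bit per table row.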
Pages: 197-214
Page count: 18