GPU-based Private Information Retrieval for On-Device Machine Learning Inference

Cited by: 0

Authors:

Lam, Maximilian [2]

Johnson, Jeff [1]

Xiong, Wenjie [3]

Maeng, Kiwan [4]

Gupta, Udit [6]

Li, Yang [1]

Lai, Liangzhen [1]

Leontiadis, Ilias [1]

Rhu, Minsoo [1]

Lee, Hsien-Hsin S. [5]

Reddi, Vijay Janapa [2]

Wei, Gu-Yeon [2]

Brooks, David [2]

Suh, G. Edward [1,6]

Affiliations:

[1] Meta AI, Menlo Park, CA 94025 USA

[2] Harvard University, Cambridge, MA 02138 USA

[3] Virginia Tech, Blacksburg, VA USA

[4] Penn State, University Park, PA USA

[5] Intel, Santa Clara, CA USA

[6] Cornell University, Ithaca, NY USA

Source:

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS 2024, Vol. 1 | 2024

Keywords:

privacy; security; cryptography; machine learning; GPU; performance; protection

DOI:

10.1145/3617232.3624855

Chinese Library Classification (CLC) number:

TP3 [Computing Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

On-device machine learning (ML) inference can enable the use of private user data on user devices without revealing them to remote servers. However, a pure on-device solution to private ML inference is impractical for many applications that rely on embedding tables that are too large to be stored on-device. In particular, recommendation models typically use multiple embedding tables, each on the order of 1-10 GB of data, making them impractical to store on-device. To overcome this barrier, we propose the use of private information retrieval (PIR) to efficiently and privately retrieve embeddings from servers without sharing any private information. As off-the-shelf PIR algorithms are usually too computationally intensive to use directly for latency-sensitive inference tasks, we 1) propose novel GPU-based acceleration of PIR, and 2) co-design PIR with the downstream ML application to obtain further speedup. Our GPU acceleration strategy improves system throughput by more than 20x over an optimized CPU PIR implementation, and our PIR-ML co-design provides an over 5x additional throughput improvement at fixed model quality. Together, for various on-device ML applications such as recommendation and language modeling, our system on a single V100 GPU can serve up to 100,000 queries per second (a more than 100x throughput improvement over a CPU-based baseline) while maintaining model accuracy.
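To make the idea of privately retrieving one embedding row concrete, the following is a minimal sketch of a textbook two-server XOR-based PIR scheme on a toy table. The abstract does not say which PIR construction the authors use, so this is an illustrative stand-in, not their method. It assumes two non-colluding servers that each hold a full copy of the table, and all names (`make_queries`, `answer`, `reconstruct`, `table`) are hypothetical.

```python
import secrets


def make_queries(index: int, n: int) -> tuple[list[int], list[int]]:
    """Client: split the selection of row `index` into two bit vectors.

    Each vector alone is uniformly random, so neither server learns `index`
    (assuming the two servers do not collude).
    """
    q0 = [secrets.randbelow(2) for _ in range(n)]
    q1 = q0.copy()
    q1[index] ^= 1  # the two shares differ only at the wanted row
    return q0, q1


def answer(table: list[bytes], query: list[int]) -> bytes:
    """Server: XOR together every row selected by the query bits."""
    acc = bytearray(len(table[0]))
    for row, bit in zip(table, query):
        if bit:
            for i, b in enumerate(row):
                acc[i] ^= b
    return bytes(acc)


def reconstruct(a0: bytes, a1: bytes) -> bytes:
    """Client: XOR the two answers; all rows cancel except the wanted one."""
    return bytes(x ^ y for x, y in zip(a0, a1))


# Toy "embedding table": 8 rows of 4 bytes each (illustrative data only).
table = [bytes([r] * 4) for r in range(8)]

q0, q1 = make_queries(5, len(table))
row = reconstruct(answer(table, q0), answer(table, q1))
assert row == table[5]
```

In this sketch each server touches every row of the table for every query, so the per-query work grows linearly with table size. That cost is one way to see why the abstract describes off-the-shelf PIR as too computationally intensive for latency-sensitive inference.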


Pages: 197-214

Page count: 18
