GPU-based Private Information Retrieval for On-Device Machine Learning Inference

Cited by: 0

Authors:

Lam, Maximilian ^{[2]}

Johnson, Jeff ^{[1]}

Xiong, Wenjie ^{[3]}

Maeng, Kiwan ^{[4]}

Gupta, Udit ^{[6]}

Li, Yang ^{[1]}

Lai, Liangzhen ^{[1]}

Leontiadis, Ilias ^{[1]}

Rhu, Minsoo ^{[1]}

Lee, Hsien-Hsin S. ^{[5]}

Reddi, Vijay Janapa ^{[2]}

Wei, Gu-Yeon ^{[2]}

Brooks, David ^{[2]}

Suh, G. Edward ^{[1,6]}

Affiliations:

[1] Meta AI, Menlo Pk, CA 94025 USA

[2] Harvard Univ, Cambridge, MA 02138 USA

[3] Virginia Tech, Blacksburg, VA USA

[4] Penn State, University Pk, PA USA

[5] Intel, Santa Clara, CA USA

[6] Cornell Univ, Ithaca, NY USA

Source:

PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON ARCHITECTURAL SUPPORT FOR PROGRAMMING LANGUAGES AND OPERATING SYSTEMS, ASPLOS 2024, VOL 1 | 2024

Keywords:

privacy; security; cryptography; machine learning; GPU; performance; PROTECTION;

DOI:

10.1145/3617232.3624855

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

On-device machine learning (ML) inference can enable the use of private user data on user devices without revealing them to remote servers. However, a pure on-device solution to private ML inference is impractical for many applications that rely on embedding tables that are too large to be stored on-device. In particular, recommendation models typically use multiple embedding tables, each on the order of 1-10 GB of data, making them impractical to store on-device. To overcome this barrier, we propose the use of private information retrieval (PIR) to efficiently and privately retrieve embeddings from servers without sharing any private information. As off-the-shelf PIR algorithms are usually too computationally intensive to directly use for latency-sensitive inference tasks, we 1) propose novel GPU-based acceleration of PIR, and 2) co-design PIR with the downstream ML application to obtain further speedup. Our GPU acceleration strategy improves system throughput by more than 20x over an optimized CPU PIR implementation, and our PIR-ML co-design provides an over 5x additional throughput improvement at fixed model quality. Together, for various on-device ML applications such as recommendation and language modeling, our system on a single V100 GPU can serve up to 100,000 queries per second (a >100x throughput improvement over a CPU-based baseline) while maintaining model accuracy.
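To make the idea of privately retrieving an embedding row concrete, the following is a minimal sketch of a textbook two-server XOR-based PIR scheme. It is an illustration only and is not taken from the abstract: it assumes two non-colluding servers that each hold a copy of the table, it sends a full-length bit vector per query, and it runs on the CPU. The names `make_queries`, `server_answer`, and `retrieve` are hypothetical, and the scheme the paper actually accelerates may differ.

```python
import secrets

def make_queries(n, index):
    """Client: build two bit vectors that differ only at `index`.

    Each vector on its own is uniformly random, so neither server
    learns which row is wanted (assuming the servers do not collude).
    """
    q0 = [secrets.randbits(1) for _ in range(n)]
    q1 = list(q0)
    q1[index] ^= 1
    return q0, q1

def server_answer(table, query):
    """Server: XOR together every row whose query bit is 1."""
    acc = bytes(len(table[0]))  # all-zero row of the same width
    for row, bit in zip(table, query):
        if bit:
            acc = bytes(a ^ b for a, b in zip(acc, row))
    return acc

def retrieve(table, index):
    """Client: combine both answers; all rows cancel except `index`."""
    q0, q1 = make_queries(len(table), index)
    a0 = server_answer(table, q0)  # would run on server 0
    a1 = server_answer(table, q1)  # would run on server 1
    return bytes(x ^ y for x, y in zip(a0, a1))

# Toy "embedding table": 8 rows of 4 bytes each (illustrative data).
table = [bytes([i] * 4) for i in range(8)]
assert retrieve(table, 5) == table[5]
```

Each server touches every row of the table for every query, which is one way to see why PIR is computationally heavy for multi-gigabyte tables and why the abstract targets throughput.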

Pages: 197-214

Page count: 18
