Bayesian Deep Learning via Expectation Maximization and Turbo Deep Approximate Message Passing

Cited by: 0
Authors
Xu, Wei [1 ]
Liu, An [1 ]
Zhang, Yiting [1 ]
Lau, Vincent [2 ]
Affiliations
[1] Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou 310027, Peoples R China
[2] Hong Kong Univ Sci & Technol, Dept ECE, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Bayes methods; Training; Deep learning; Signal processing algorithms; Message passing; Federated learning; Neurons; Bayesian deep learning; Bayesian federated learning; DNN model compression; expectation maximization; turbo deep approximate message passing;
DOI
10.1109/TSP.2024.3442858
CLC Number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Code
0808; 0809
Abstract
Efficient learning and model compression algorithms for deep neural networks (DNNs) are a key workhorse behind the rise of deep learning (DL). In this work, we propose a message passing-based Bayesian deep learning algorithm called EM-TDAMP to avoid the drawbacks of traditional stochastic gradient descent (SGD)-based learning algorithms and regularization-based model compression methods. Specifically, we formulate DNN learning and compression as a sparse Bayesian inference problem, in which a group sparse prior is employed to achieve structured model compression. We then propose an expectation maximization (EM) framework that estimates the posterior distributions of the parameters (E-step) and updates the hyperparameters (M-step), where the E-step is realized by a newly proposed turbo deep approximate message passing (TDAMP) algorithm. We further extend EM-TDAMP to a novel Bayesian federated learning framework, in which each client runs TDAMP to efficiently compute a local posterior distribution from its local data, while the central server first aggregates the local posterior distributions into a global posterior distribution and then updates the hyperparameters via EM to accelerate convergence. We detail the application of EM-TDAMP to Boston housing price prediction and handwriting recognition, and present extensive numerical results demonstrating the advantages of EM-TDAMP.
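To make the described workflow concrete, below is a minimal Python sketch of the EM alternation summarized in the abstract, assuming a Bernoulli-Gaussian (spike-and-slab) group-sparse prior and a factorized Gaussian approximate posterior. The e_step callable stands in for the paper's TDAMP algorithm, whose internals are not reproduced here, and the moment-matching M-step is an illustrative assumption rather than the paper's exact update.

import numpy as np

def m_step(post_mean, post_var, groups):
    """M-step: refresh hyperparameters (per-group active probability and
    slab variance) by moment matching on the current posterior.
    The specific update rule here is an assumption for illustration."""
    rho = np.zeros(len(groups))   # estimated active probability per group
    tau = np.zeros(len(groups))   # estimated slab variance per group
    for g, idx in enumerate(groups):
        second_moment = post_mean[idx] ** 2 + post_var[idx]
        tau[g] = second_moment.mean()
        rho[g] = (np.abs(post_mean[idx]) > 1e-3).mean()  # crude sparsity proxy
    return rho, tau

def em_skeleton(e_step, mean, var, groups, n_iters=10):
    """Alternate posterior inference (E-step; TDAMP in the paper) with the
    hyperparameter update (M-step) for a fixed number of iterations."""
    for _ in range(n_iters):
        mean, var = e_step(mean, var)         # E-step: approximate posterior
        rho, tau = m_step(mean, var, groups)  # M-step: hyperparameter update
    return mean, var, rho, tau

In the federated extension, the abstract states only that the server aggregates the clients' local posteriors before the EM hyperparameter update; one plausible aggregation rule for factorized Gaussian posteriors (an assumption, since the abstract does not give the exact formula) is the precision-weighted product below.

def aggregate_gaussian_posteriors(means, variances):
    """Fuse factorized Gaussian local posteriors by multiplying densities:
    precisions add, and the fused mean is precision-weighted. This is a
    hypothetical server-side rule, not necessarily the paper's."""
    precision = sum(1.0 / v for v in variances)
    fused_mean = sum(m / v for m, v in zip(means, variances)) / precision
    return fused_mean, 1.0 / precision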
Pages: 3865-3878
Number of pages: 14