Optimized Parameter-Efficient Deep Learning Systems via Reversible Jump Simulated Annealing

Cited by: 0
Authors
Marsh, Peter [1 ]
Kuruoglu, Ercan Engin [1 ]
Affiliations
[1] Univ Town Shenzhen, Tsinghua Berkeley Shenzhen Inst, Tsinghua Shenzhen Int Grad Sch, Shenzhen 518055, Peoples R China
Keywords
Simulated annealing; Optimization; Neural networks; Long short term memory; Image recognition; Task analysis; Data models; Deep learning systems; Model selection; Reversible jump Markov chain Monte Carlo; Neural network; Algorithm
DOI
10.1109/JSTSP.2024.3428355
CLC (Chinese Library Classification)
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology]
Discipline Codes
0808; 0809
Abstract
We apply simulated annealing, a non-convex optimization method, enriched with reversible jumps to give deep learning models a model selection capability in a model-size-aware context. The reversible-jump mechanism yields robust stochastic learning of the hidden posterior distribution over network structures, producing a focused, low-uncertainty estimate of the structure while making use of all the data. Because the method rests on Markov chain Monte Carlo, we construct priors that favor smaller, simpler architectures, so the search converges on globally optimal models that are also parameter-efficient: low-parameter-count deep models that retain good predictive accuracy. We demonstrate the approach on standard image recognition with CIFAR-10 and on model selection for time-series tasks, realizing networks whose performance is competitive with rival non-convex optimization methods such as genetic algorithms, random search, and Gaussian-process-based Bayesian optimization, while being less than half their size.
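The following is a minimal, illustrative Python sketch of the general idea behind reversible jump simulated annealing for size-aware model selection. It is not the authors' implementation: the toy energy function (a stand-in for validation loss plus a parameter-count penalty playing the role of the size-favoring prior), the perturb/birth/death proposal moves, and the geometric cooling schedule are all assumptions chosen to keep the example self-contained and runnable.

import math
import random

def energy(widths):
    # widths: [input, hidden..., output] layer sizes of a fully connected net.
    # NOTE: toy stand-in for a real validation loss; the quadratic term is
    # arbitrary and only creates a non-trivial optimum for the demo.
    n_params = sum(a * b for a, b in zip(widths, widths[1:]))
    capacity = sum(widths[1:-1])
    pseudo_loss = (capacity - 60) ** 2 / 600.0
    return pseudo_loss + 1e-3 * n_params  # penalty ~ prior favoring small models

def propose(widths):
    # Reversible-jump moves: perturb a width (fixed dimension), or add/remove
    # a hidden layer (trans-dimensional birth/death).
    w = list(widths)
    hidden = len(w) - 2
    move = random.choice(("perturb", "birth", "death"))
    if move == "perturb" and hidden > 0:
        i = random.randrange(1, len(w) - 1)
        w[i] = max(1, w[i] + random.choice((-4, 4)))
    elif move == "birth":
        w.insert(random.randrange(1, len(w)), random.randint(4, 32))
    elif move == "death" and hidden > 1:
        del w[random.randrange(1, len(w) - 1)]
    return w

def rjsa(init, t0=2.0, cooling=0.995, steps=5000, seed=0):
    random.seed(seed)
    state, e = list(init), energy(init)
    best, best_e = list(state), e
    t = t0
    for _ in range(steps):
        cand = propose(state)
        e_new = energy(cand)
        # Annealed Metropolis acceptance; a full RJ-MCMC treatment would also
        # include the proposal ratio / Jacobian term for dimension jumps.
        if e_new <= e or random.random() < math.exp((e - e_new) / t):
            state, e = cand, e_new
            if e < best_e:
                best, best_e = list(state), e
        t *= cooling  # geometric cooling schedule (an assumption)
    return best, best_e

if __name__ == "__main__":
    arch, score = rjsa([3, 16, 16, 10])
    print("selected widths:", arch, "energy: %.3f" % score)

Running the script anneals over both layer widths and layer count, so the returned architecture reflects the trade-off between the pseudo-loss and the parameter-count penalty; in the paper's setting the penalty role is played by priors over structure rather than this hand-tuned term.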
Pages: 1010-1023
Page count: 14