Optimized Parameter-Efficient Deep Learning Systems via Reversible Jump Simulated Annealing

Cited by: 0

Authors:

Marsh, Peter [1]

Kuruoglu, Ercan Engin [1]

Affiliations:

[1] Univ Town Shenzhen, Tsinghua Berkeley Shenzhen Inst, Tsinghua Shenzhen Int Grad Sch, Shenzhen 518055, Peoples R China

Source:

IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING | 2024 / Vol. 18 / No. 06

Keywords:

Simulated annealing; Optimization; Neural networks; Long short term memory; Image recognition; Task analysis; Data models; Deep learning systems; model selection; reversible jump Markov chain Monte Carlo; simulated annealing; NEURAL-NETWORK; ALGORITHM;

DOI:

10.1109/JSTSP.2024.3428355

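Chinese Library Classification number:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification code:

0808 ; 0809 ;

Abstract: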

We use simulated annealing, a non-convex optimization method, enriched with reversible jumps to perform model selection for deep learning models in a model-size-aware context. This approach yields a robust stochastic learning of the hidden posterior distribution of the network structure while simultaneously constructing a more focused and certain estimate of that structure, all while making use of all the data. Because the method is based on Markov-chain learning, we construct our priors to favor smaller and simpler architectures, which allows us to converge on the set of globally optimal models that are also parameter-efficient: deep models with low parameter counts that retain good predictive accuracy. We demonstrate this capability on standard image recognition with CIFAR-10 and on model selection for time-series tasks, realizing networks whose performance is competitive with other non-convex optimization methods such as genetic algorithms, random search, and Gaussian-process-based Bayesian optimization, while being less than half the size.
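
As a rough illustration of the idea in the abstract, the following Python sketch runs simulated annealing over the hidden-layer widths of a small fully connected network, using birth, death, and resize moves and an energy that adds a size penalty to the loss. It is not the authors' implementation: `toy_loss`, `size_penalty`, the move set, and the cooling schedule are all assumptions made for illustration, and the proposal-ratio and Jacobian terms of a full reversible jump acceptance rule are omitted.

```python
import math
import random


def param_count(widths, n_in=8, n_out=1):
    """Weights plus biases of a fully connected net with these hidden widths."""
    sizes = [n_in] + list(widths) + [n_out]
    return sum((a + 1) * b for a, b in zip(sizes[:-1], sizes[1:]))


def toy_loss(widths):
    """Hypothetical stand-in for a validation loss (no network is trained)."""
    return 1.0 / (1.0 + sum(widths))


def energy(widths, size_penalty=1e-3):
    """Loss plus a penalty that acts like a prior favoring fewer parameters."""
    return toy_loss(widths) + size_penalty * param_count(widths)


def propose(widths, rng, max_layers=4, max_width=64):
    """Return a neighboring architecture via a birth, death, or resize move."""
    new = list(widths)
    move = rng.choice(["birth", "death", "resize"])
    if move == "birth" and len(new) < max_layers:
        # Dimension-increasing move: insert a new hidden layer.
        new.insert(rng.randrange(len(new) + 1), rng.randint(1, max_width))
    elif move == "death" and len(new) > 1:
        # Dimension-decreasing move: delete an existing hidden layer.
        new.pop(rng.randrange(len(new)))
    else:
        # Fixed-dimension move (also the fallback when birth/death is not allowed).
        i = rng.randrange(len(new))
        new[i] = min(max_width, max(1, new[i] + rng.choice([-4, -1, 1, 4])))
    return new


def anneal(steps=5000, t_start=1.0, t_end=1e-3, seed=0):
    """Simulated annealing over architectures; returns (best_widths, best_energy)."""
    rng = random.Random(seed)
    current = [32, 32]
    current_e = energy(current)
    best, best_e = list(current), current_e
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / (steps - 1))
        cand = propose(current, rng)
        cand_e = energy(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if cand_e <= current_e or rng.random() < math.exp(-(cand_e - current_e) / t):
            current, current_e = cand, cand_e
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e


if __name__ == "__main__":
    widths, e = anneal()
    print(widths, param_count(widths), round(e, 4))
```

In a real setting `toy_loss` would be replaced by the validation loss of a trained network, which is where most of the computational cost would lie.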

Pages: 1010 - 1023

Page count: 14
