Optimized Parameter-Efficient Deep Learning Systems via Reversible Jump Simulated Annealing

Cited by: 0

Authors:

Marsh, Peter [1]

Kuruoglu, Ercan Engin [1]

Affiliations:

[1] Univ Town Shenzhen, Tsinghua Berkeley Shenzhen Inst, Tsinghua Shenzhen Int Grad Sch, Shenzhen 518055, Peoples R China

Source:

IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING | 2024 / Vol. 18 / No. 06

Keywords:

Simulated annealing; Optimization; Neural networks; Long short term memory; Image recognition; Task analysis; Data models; Deep learning systems; model selection; reversible jump Markov chain Monte Carlo; simulated annealing; NEURAL-NETWORK; ALGORITHM;

DOI:

10.1109/JSTSP.2024.3428355

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

We utilize the non-convex optimization method simulated annealing enriched with reversible jumps to enable a model selection capacity for deep learning models in a model size aware context. By using simulated annealing enriched with reversible jumps, we can yield a robust stochastic learning of the hidden posterior distribution of the structure, simultaneously constructing a more focused and certain estimate of the structure, all while making use of all the data. Being based upon Markov-chain learning methods, we constructed our priors to favor smaller and simpler architectures, allowing us to converge on the set of globally optimal models that are additionally parameter-efficient, seeking low parameter count deep models that retain good predictive accuracy. We demonstrate the capability on standard image recognition with CIFAR-10, as well as performing model selection on time-series tasks, realizing networks with competitive performance as compared to competing non-convex optimization methods such as genetic algorithms, random search, and Gaussian process based Bayesian optimization, while being less than half the size.
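The general idea in the abstract (annealing over models of varying size, with a prior that favors fewer parameters) can be illustrated on a toy problem. The following is a minimal sketch and not the authors' method: it fits a polynomial whose number of coefficients can grow or shrink, and every name in it (`energy`, `propose`, `anneal`, `size_penalty`, `max_params`) is a hypothetical choice made for illustration. It also simplifies the acceptance rule to a plain Metropolis test on the energy difference, omitting the proposal-ratio and Jacobian terms that a full reversible jump acceptance probability would include.

```python
import math
import random


def energy(coeffs, xs, ys, size_penalty):
    """Mean squared error plus a penalty proportional to the parameter count.

    The penalty plays the role of a prior that favors smaller models
    (an assumption of this sketch, not the paper's actual prior).
    """
    mse = 0.0
    for x, y in zip(xs, ys):
        pred = sum(c * x ** k for k, c in enumerate(coeffs))
        mse += (pred - y) ** 2
    mse /= len(xs)
    return mse + size_penalty * len(coeffs)


def propose(coeffs, rng, max_params):
    """Return a candidate model via a perturb, birth, or death move."""
    new = list(coeffs)
    move = rng.choice(["perturb", "birth", "death"])
    if move == "birth" and len(new) < max_params:
        # Dimension-increasing jump: add one small random coefficient.
        new.append(rng.gauss(0.0, 0.1))
    elif move == "death" and len(new) > 1:
        # Dimension-decreasing jump: drop the highest-order coefficient.
        new.pop()
    else:
        # Within-model move: nudge one existing coefficient.
        i = rng.randrange(len(new))
        new[i] += rng.gauss(0.0, 0.1)
    return new


def anneal(xs, ys, steps=20000, t0=1.0, cooling=0.9995,
           size_penalty=0.01, max_params=6, seed=0):
    """Simulated annealing over both model size and coefficient values."""
    rng = random.Random(seed)
    current = [0.0]
    e_cur = energy(current, xs, ys, size_penalty)
    best, e_best = list(current), e_cur
    t = t0
    for _ in range(steps):
        cand = propose(current, rng, max_params)
        e_cand = energy(cand, xs, ys, size_penalty)
        delta = e_cand - e_cur
        # Simplified Metropolis rule at temperature t.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, e_cur = cand, e_cand
            if e_cur < e_best:
                best, e_best = list(current), e_cur
        t *= cooling  # geometric cooling schedule
    return best, e_best


if __name__ == "__main__":
    xs = [i / 10 for i in range(-10, 11)]
    ys = [1 + 2 * x for x in xs]  # data generated by a 2-parameter model
    best, e_best = anneal(xs, ys)
    print(len(best), best, e_best)
```

In this sketch, raising `size_penalty` pushes the search toward models with fewer coefficients at some cost in fit, which mirrors the trade-off between parameter count and predictive accuracy that the abstract describes.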

Pages: 1010-1023

Page count: 14

