Optimized Parameter-Efficient Deep Learning Systems via Reversible Jump Simulated Annealing

Cited by: 0
Authors
Marsh, Peter [1]
Kuruoglu, Ercan Engin [1]
Affiliations
[1] Univ Town Shenzhen, Tsinghua Berkeley Shenzhen Inst, Tsinghua Shenzhen Int Grad Sch, Shenzhen 518055, Peoples R China
Keywords
Simulated annealing; Optimization; Neural networks; Long short term memory; Image recognition; Task analysis; Data models; Deep learning systems; model selection; reversible jump Markov chain Monte Carlo; simulated annealing; NEURAL-NETWORK; ALGORITHM
DOI
10.1109/JSTSP.2024.3428355
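Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809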
Abstract
We use simulated annealing, a non-convex optimization method, enriched with reversible jumps to give deep learning models a model selection capability in a model-size-aware context. Simulated annealing with reversible jumps yields robust stochastic learning of the hidden posterior distribution of the structure and simultaneously constructs a more focused and certain estimate of that structure, all while making use of all the data. Because the approach is based on Markov-chain learning methods, we construct our priors to favor smaller and simpler architectures, which lets us converge on the set of globally optimal models that are also parameter-efficient: deep models with low parameter counts that retain good predictive accuracy. We demonstrate this capability on standard image recognition with CIFAR-10 and on model selection for time-series tasks, obtaining networks that perform competitively with competing non-convex optimization methods such as genetic algorithms, random search, and Gaussian-process-based Bayesian optimization, while being less than half the size.
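As a rough illustration of the idea in the abstract (not the authors' implementation), the following Python sketch runs simulated annealing over dense-network architectures of varying depth, using birth and death moves to jump between model sizes and an energy that adds a size penalty to a data-fit term. Everything here is an assumption made for illustration: `toy_loss` is a hypothetical stand-in for a trained network's validation loss, `size_weight` plays the role of a prior favoring small models, and the acceptance rule is the plain Metropolis rule, omitting the proposal-ratio and Jacobian terms that a full reversible jump sampler would include.

```python
import math
import random


def param_count(widths, n_in=8, n_out=1):
    """Weights plus biases of a dense network with the given hidden widths."""
    sizes = [n_in] + list(widths) + [n_out]
    return sum((a + 1) * b for a, b in zip(sizes, sizes[1:]))


def toy_loss(widths):
    """Hypothetical stand-in for validation loss: falls as capacity grows, then saturates."""
    return 1.0 / (1.0 + sum(widths))


def energy(widths, size_weight=1e-3):
    """Data-fit term plus a size penalty (an assumed prior favoring small models)."""
    return toy_loss(widths) + size_weight * param_count(widths)


def propose(widths, rng, max_layers=4, max_width=64):
    """Birth (add a layer), death (remove a layer), or a within-model width change."""
    new = list(widths)
    move = rng.choice(["birth", "death", "perturb"])
    if move == "birth" and len(new) < max_layers:
        new.insert(rng.randrange(len(new) + 1), rng.randint(1, max_width))
    elif move == "death" and len(new) > 1:
        new.pop(rng.randrange(len(new)))
    else:
        # Fall back to a width perturbation when a jump is not allowed.
        i = rng.randrange(len(new))
        new[i] = min(max_width, max(1, new[i] + rng.choice([-4, -1, 1, 4])))
    return new


def anneal(start=(32, 32), steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Simulated annealing with geometric cooling over variable-size architectures."""
    rng = random.Random(seed)
    current = list(start)
    e_cur = energy(current)
    best, e_best = list(current), e_cur
    t = t0
    for _ in range(steps):
        cand = propose(current, rng)
        e_cand = energy(cand)
        # Metropolis rule: always accept improvements, sometimes accept worse states.
        if e_cand <= e_cur or rng.random() < math.exp((e_cur - e_cand) / t):
            current, e_cur = cand, e_cand
            if e_cur < e_best:
                best, e_best = list(current), e_cur
        t *= cooling
    return best, e_best
```

Calling `anneal()` returns the lowest-energy list of hidden widths visited together with its energy. In a real setting, `toy_loss` would be replaced by training and evaluating each candidate network, which dominates the cost of the search.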
Pages: 1010-1023
Page count: 14
Related papers
50 in total
  • [1] Multi-UAV reconnaissance mission planning via deep reinforcement learning with simulated annealing
    Fan, Mingfeng
    Liu, Huan
    Wu, Guohua
    Gunawan, Aldy
    Sartoretti, Guillaume
    SWARM AND EVOLUTIONARY COMPUTATION, 2025, 93
  • [2] Robust Multimodal Learning With Missing Modalities via Parameter-Efficient Adaptation
    Reza, Md Kaykobad
    Prater-Bennette, Ashley
    Asif, M. Salman
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (02) : 742 - 754
  • [3] Simulated annealing using a Reversible Jump Markov Chain Monte Carlo algorithm for fuzzy clustering
    Bandyopadhyay, S
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2005, 17 (04) : 479 - 490
  • [4] An efficient simulated annealing algorithm for stochastic systems
    Alkhamis, Talal M.
    KUWAIT JOURNAL OF SCIENCE & ENGINEERING, 2006, 33 (02) : 47 - 68
  • [5] Estimating the Efficient Parameter Values of Different Neighborhood Search Techniques of Simulated Annealing in Forest Spatial Planning Problems
    Dong, Lingbo
    Tian, Dongyuan
    Lu, Wei
    Liu, Zhaogang
    IEEE ACCESS, 2020, 8 : 115905 - 115921
  • [6] Simulated Annealing Algorithm for Deep Learning
    Rere, L. M. Rasdi
    Fanany, Mohamad Ivan
    Arymurthy, Aniati Murni
    THIRD INFORMATION SYSTEMS INTERNATIONAL CONFERENCE 2015, 2015, 72 : 137 - 144
  • [7] Parameter-Efficient Transfer Learning for Medical Visual Question Answering
    Liu, Jiaxiang
    Hu, Tianxiang
    Zhang, Yan
    Feng, Yang
    Hao, Jin
    Lv, Junhui
    Liu, Zuozhu
    IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2024, 8 (04) : 2816 - 2826
  • [8] AiRs: Adapter in Remote Sensing for Parameter-Efficient Transfer Learning
    Hu, Leiyi
    Yu, Hongfeng
    Lu, Wanxuan
    Yin, Dongshuo
    Sun, Xian
    Fu, Kun
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 18
  • [9] A deep reinforcement learning assisted simulated annealing algorithm for a maintenance planning problem
    Kosanoglu, Fuat
    Atmis, Mahir
    Turan, Hasan Huseyin
    ANNALS OF OPERATIONS RESEARCH, 2024, 339 (1-2) : 79 - 110
  • [10] Optimized simulated annealing for efficient generation of highly nonlinear S-boxes
    Kuznetsov, Alexandr
    Poluyanenko, Nikolay
    Frontoni, Emanuele
    Kandiy, Sergey
    Pieshkova, Olha
    SOFT COMPUTING, 2024, 28 (05) : 3905 - 3920