Optimized Parameter-Efficient Deep Learning Systems via Reversible Jump Simulated Annealing

Cited by: 0
Authors
Marsh, Peter [1]
Kuruoglu, Ercan Engin [1]
Affiliations
[1] Univ Town Shenzhen, Tsinghua Berkeley Shenzhen Inst, Tsinghua Shenzhen Int Grad Sch, Shenzhen 518055, Peoples R China
Keywords
Simulated annealing; Optimization; Neural networks; Long short term memory; Image recognition; Task analysis; Data models; Deep learning systems; model selection; reversible jump Markov chain Monte Carlo; simulated annealing; NEURAL-NETWORK; ALGORITHM;
DOI
10.1109/JSTSP.2024.3428355
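Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline classification codes
0808; 0809;
Abstract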
We utilize the non-convex optimization method simulated annealing enriched with reversible jumps to enable a model selection capacity for deep learning models in a model size aware context. By using simulated annealing enriched with reversible jumps, we can yield a robust stochastic learning of the hidden posterior distribution of the structure, simultaneously constructing a more focused and certain estimate of the structure, all while making use of all the data. Being based upon Markov-chain learning methods, we constructed our priors to favor smaller and simpler architectures, allowing us to converge on the set of globally optimal models that are additionally parameter-efficient, seeking low parameter count deep models that retain good predictive accuracy. We demonstrate the capability on standard image recognition with CIFAR-10, as well as performing model selection on time-series tasks, realizing networks with competitive performance as compared to competing non-convex optimization methods such as genetic algorithms, random search, and Gaussian process based Bayesian optimization, while being less than half the size.
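The idea in the abstract, annealing over models of varying size while a prior penalizes larger ones, can be illustrated with a toy sketch. This is not the authors' implementation: the polynomial-fitting task, the function names (`make_data`, `energy`, `propose`, `anneal`), the birth/death moves, the size penalty, and the cooling schedule are all assumptions chosen for illustration. The acceptance rule is a plain Metropolis test and omits the proposal-ratio and Jacobian terms that a full reversible-jump acceptance probability would include.

```python
import math
import random


def make_data(n=41):
    """Toy data from a quadratic, so the 'right' model has 3 coefficients."""
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    ys = [1.0 + 2.0 * x - 3.0 * x * x for x in xs]
    return xs, ys


def energy(coefs, xs, ys, size_penalty=0.01):
    """Mean squared error plus a penalty per parameter.

    The penalty plays the role of a negative log-prior that favors
    smaller models (an assumed form, not taken from the paper).
    """
    mse = 0.0
    for x, y in zip(xs, ys):
        pred = sum(c * x ** i for i, c in enumerate(coefs))
        mse += (pred - y) ** 2
    mse /= len(xs)
    return mse + size_penalty * len(coefs)


def propose(coefs, rng, max_size=8):
    """Pick a move: perturb a coefficient, or jump to a bigger/smaller model."""
    new = list(coefs)
    move = rng.choice(["perturb", "birth", "death"])
    if move == "birth" and len(new) < max_size:
        new.append(rng.gauss(0.0, 0.1))   # add one parameter
    elif move == "death" and len(new) > 1:
        new.pop()                         # drop the highest-order parameter
    else:
        i = rng.randrange(len(new))
        new[i] += rng.gauss(0.0, 0.3)     # within-model random walk
    return new


def anneal(xs, ys, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Simulated annealing with dimension-changing moves (simplified)."""
    rng = random.Random(seed)
    current = [0.0]
    e_cur = energy(current, xs, ys)
    best, e_best = list(current), e_cur
    temp = t0
    for _ in range(steps):
        cand = propose(current, rng)
        e_cand = energy(cand, xs, ys)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if e_cand <= e_cur or rng.random() < math.exp((e_cur - e_cand) / temp):
            current, e_cur = cand, e_cand
            if e_cur < e_best:
                best, e_best = list(current), e_cur
        temp *= cooling                   # geometric cooling schedule
    return best, e_best


if __name__ == "__main__":
    xs, ys = make_data()
    best, e_best = anneal(xs, ys)
    print(len(best), e_best)
```

Raising `size_penalty` in this sketch pushes the search toward models with fewer coefficients at some cost in fit, which is the trade-off between parameter count and accuracy that the abstract describes.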
Pages: 1010-1023
Page count: 14