End-to-End Deep Policy Feedback-Based Reinforcement Learning Method for Quantization in DNNs

Cited by: 1
Authors
Babu, R. Logesh [1]
Gurumoorthy, Sasikumar [2]
Parameshachari, B. D. [3]
Nelson, S. Christalin [4]
Hua, Qiaozhi [5]
Affiliations
[1] Madanapalle Inst Technol & Sci, Dept Comp Sci & Engn, Chittoor 517325, Andhra Pradesh, India
[2] Jerusalem Coll Engn, Dept Comp Sci & Engn, Chennai 600100, Tamil Nadu, India
[3] GSSS Inst Engn & Technol Women, Dept Telecommun Engn, Mysuru 570011, Karnataka, India
[4] Univ Petr & Energy Studies UPES, Sch Comp Sci, Dept Syst Cluster, Dehra Dun 248007, Uttarakhand, India
[5] Hubei Univ Arts & Sci, Sch Comp, Xiangyang 441000, Hubei, Peoples R China
Keywords
Constrained embedded systems; deep neural networks; long short-term memory network; policy feedback; proximal policy optimization technique; reinforcement learning method; NEURAL ARCHITECTURE SEARCH; EFFICIENCY; ACCURACY; NETWORKS;
DOI
10.1142/S0218126622502322
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In resource-constrained embedded systems, designing efficient deep neural networks (DNNs) is challenging because of the diversity of artificial intelligence applications. Quantization substantially reduces the storage and computation time of a DNN by lowering the bit-width of the network's encoding. To address the resulting accuracy loss, the quantization levels are discovered automatically using a Policy Feedback-based Reinforcement Learning Method (PF-RELEQ). This paper proposes the Proximal Policy Optimization with Policy Feedback (PPO-PF) technique, which determines the best design decisions by choosing the optimum hyper-parameters. To make the value function more sensitive to changes in the policy and to improve the accuracy of value estimation in the early learning stage, a policy update method based on the clipped discount factor is devised. In addition, the loss functions of the policy satisfy the unbiased estimation of the trust region. The proposed PF-RELEQ effectively balances quality and speed compared to other deep learning methods such as ResNet-1202, ResNet-32, ResNet-110, GoogLeNet and AlexNet. The experimental analysis showed that PF-RELEQ achieved a 20% reduction in computational workload compared to existing deep learning methods on the ImageNet, CIFAR-10, CIFAR-100 and tomato leaf disease datasets, with an improvement of approximately 2% in validation accuracy. Additionally, PF-RELEQ needs only 0.55 Graphics Processing Unit on an NVIDIA GTX-1080Ti to develop DNNs that deliver better accuracy with fewer cycle counts for image classification.
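As a rough illustration of what lowering the bit-width means, the sketch below applies generic symmetric uniform quantization to a handful of weights. It is not the PF-RELEQ method from the paper, which selects quantization levels with a reinforcement learning agent; the function names (`quantize_uniform`, `dequantize`, `max_abs_error`) and the sample weights are hypothetical and chosen only for this example.

```python
def quantize_uniform(weights, bits):
    """Map floats to signed integer codes on a symmetric `bits`-bit grid.

    Generic textbook scheme, assumed here for illustration only.
    """
    if bits < 2:
        raise ValueError("bits must be at least 2 for a signed symmetric grid")
    qmax = 2 ** (bits - 1) - 1            # largest code, e.g. 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest grid point and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


def max_abs_error(weights, bits):
    """Largest absolute difference between original and restored weights."""
    codes, scale = quantize_uniform(weights, bits)
    restored = dequantize(codes, scale)
    return max(abs(w - r) for w, r in zip(weights, restored))


# Hypothetical sample weights, not taken from the paper.
weights = [0.9, -0.45, 0.12, -0.03, 0.6]
for bits in (8, 4, 2):
    print(bits, max_abs_error(weights, bits))
```

Storing 4-bit codes instead of 32-bit floats cuts weight storage by a factor of 8, at the cost of a rounding error bounded by half the grid step (`scale / 2`). Choosing the bit-width that keeps this error tolerable is the trade-off that an automated search such as the one described in the abstract aims to resolve.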
Pages: 25