Accurate Low-Bit Length Floating-Point Arithmetic with Sorting Numbers

Citations: 0
Authors
Alireza Dehghanpour
Javad Khodamoradi Kordestani
Masoud Dehyadegari
Affiliations
[1] K. N. Toosi University of Technology, Faculty of Computer Engineering
[2] Institute for Research in Fundamental Sciences (IPM), School of Computer Science
Source
Neural Processing Letters | 2023 / Vol. 55
Keywords
Deep neural networks; Floating point; Sorting; AlexNet; Convolutional neural networks
DOI
Not available
Abstract
A 32-bit floating-point format is often used for the development and training of deep neural networks. Training and inference in deep learning-optimized codecs can result in enormous performance and energy efficiency advantages. However, training and inference with low-bit neural networks still pose a significant challenge. In this study, we propose a sorting method that maintains accuracy in numerical formats with a low number of bits. We tested this method on convolutional neural networks, including AlexNet. With our method, our convolutional neural network reaches, with 11 bits, the accuracy of the IEEE 32-bit format; AlexNet reaches it with 10 bits. These results suggest that the sorting method shows promise for calculations with limited precision.
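The abstract does not describe the sorting procedure itself, so the following is only a generic illustration of why operand order can matter in a low-bit format, not the authors' method. It is a minimal sketch that assumes IEEE 754 binary16 as the low-bit format and uses hypothetical helper names (`to_half`, `half_sum`); it accumulates the same values once in their given order and once sorted by ascending magnitude.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 value (assumed format)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def half_sum(values):
    """Accumulate left to right, rounding to binary16 after every addition."""
    total = 0.0
    for v in values:
        total = to_half(total + to_half(v))
    return total


# Illustrative data (not from the paper): one large term and many small ones.
# The exact sum is 2148.
values = [2048.0] + [1.0] * 100

# Given order: each 1.0 is absorbed, because binary16 spacing near 2048 is 2.
unsorted_total = half_sum(values)

# Ascending magnitude: the small terms add up exactly before meeting 2048.
sorted_total = half_sum(sorted(values, key=abs))

print(unsorted_total, sorted_total)
```

In this constructed case the ordered accumulation recovers the exact sum while the unordered one loses every small term. Real network activations and weights will behave less cleanly than this example.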
Pages: 12061-12078
Page count: 17