Accurate Low-Bit Length Floating-Point Arithmetic with Sorting Numbers

Cited by: 0
Authors
Alireza Dehghanpour
Javad Khodamoradi Kordestani
Masoud Dehyadegari
Affiliations
[1] K. N. Toosi University of Technology,Faculty of Computer Engineering
[2] Institute for Research in Fundamental Sciences (IPM),School of Computer Science
Source
Neural Processing Letters | 2023, Vol. 55
Keywords
Deep neural networks; Floating point; Sorting; AlexNet; Convolutional neural networks;
DOI
Not available
Abstract
A 32-bit floating-point format is commonly used for developing and training deep neural networks. Codecs optimized for deep learning training and inference can yield enormous performance and energy-efficiency advantages; however, training and running inference with low-bit neural networks remains a significant challenge. In this study, we propose a sorting method that maintains accuracy in numerical formats with a low number of bits. We evaluated this method on convolutional neural networks, including AlexNet. With our method, the accuracy of our convolutional neural network at 11 bits matches that of the IEEE 32-bit format, and the accuracy of AlexNet at 10 bits likewise matches the IEEE 32-bit format. These results suggest that the sorting method is promising for limited-precision computation.
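The abstract does not spell out the mechanism, so the following is a minimal, hypothetical Python sketch of one common reading of "sorting numbers" for low-bit arithmetic: ordering summands by magnitude before accumulating them in a simulated short-significand format. The rounding model (round_to_bits), the 7-bit significand, and the ascending ordering are assumptions made for illustration only; they are not taken from the paper.

import math
import random


def round_to_bits(x, mantissa_bits):
    # Simulated low-precision rounding: keep only `mantissa_bits` bits of
    # significand (a toy model, not the paper's exact number format).
    if x == 0.0 or math.isinf(x) or math.isnan(x):
        return x
    m, e = math.frexp(x)              # x == m * 2**e with 0.5 <= |m| < 1
    scale = 1 << mantissa_bits
    return math.ldexp(round(m * scale) / scale, e)


def low_bit_sum(values, mantissa_bits):
    # Accumulate values, rounding the running sum after every addition,
    # mimicking an accumulator with a short significand.
    acc = 0.0
    for v in values:
        acc = round_to_bits(acc + round_to_bits(v, mantissa_bits), mantissa_bits)
    return acc


random.seed(0)
# Positive summands spanning several orders of magnitude.
values = [2.0 ** random.uniform(-12.0, 0.0) for _ in range(5000)]
exact = sum(values)                                    # double-precision reference
err_unsorted = abs(low_bit_sum(values, 7) - exact)
err_sorted = abs(low_bit_sum(sorted(values), 7) - exact)
print(f"reference sum        : {exact:.4f}")
print(f"error, original order: {err_unsorted:.3e}")
print(f"error, sorted order  : {err_sorted:.3e}")

The script only contrasts accumulation error for the two operand orderings against a double-precision reference; the paper's actual sorting strategy and number format may differ.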
Pages: 12061-12078
Number of pages: 17
Related Papers
50 in total
  • [21] WHAT EVERY COMPUTER SCIENTIST SHOULD KNOW ABOUT FLOATING-POINT ARITHMETIC
    GOLDBERG, D
    COMPUTING SURVEYS, 1991, 23 (01) : 5 - 48
  • [22] A Monte-Carlo Floating-Point Unit for Self-Validating Arithmetic
    Yeung, Jackson H. C.
    Young, Evangeline F. Y.
    Leong, Philip H. W.
    FPGA 11: PROCEEDINGS OF THE 2011 ACM/SIGDA INTERNATIONAL SYMPOSIUM ON FIELD PROGRAMMABLE GATE ARRAYS, 2011, : 199 - 207
  • [23] Rapid application specific floating-point unit generation with bit-alignment
    Chong, Yee Jern
    Parameswaran, Sri
    2008 45TH ACM/IEEE DESIGN AUTOMATION CONFERENCE, VOLS 1 AND 2, 2008, : 62 - 67
  • [24] Hardware Implementation of 24-bit Vedic Multiplier in 32-bit Floating-Point Divider
    Hanuman, C. R. S.
    Kamala, J.
    2018 4TH INTERNATIONAL CONFERENCE ON ELECTRICAL, ELECTRONICS AND SYSTEM ENGINEERING (ICEESE), 2018, : 60 - 64
  • [25] Secure computation of hidden Markov models and secure floating-point arithmetic in the malicious model
    Aliasgari, Mehrdad
    Blanton, Marina
    Bayatbabolghani, Fattaneh
    INTERNATIONAL JOURNAL OF INFORMATION SECURITY, 2017, 16 (06) : 577 - 601
  • [26] A Modular-Positional Computation Technique for Multiple-Precision Floating-Point Arithmetic
    Isupov, Konstantin
    Knyazkov, Vladimir
    PARALLEL COMPUTING TECHNOLOGIES (PACT 2015), 2015, 9251 : 47 - 61
  • [27] ASIC Design of Nanoscale Artificial Neural Networks for Inference/Training by Floating-Point Arithmetic
    Niknia, Farzad
    Wang, Ziheng
    Liu, Shanshan
    Reviriego, Pedro
    Louri, Ahmed
    Lombardi, Fabrizio
    IEEE TRANSACTIONS ON NANOTECHNOLOGY, 2024, 23 : 208 - 216
  • [28] A Novel Low Power and High Speed Multiply-Accumulate (MAC) Unit Design for Floating-Point Numbers
    Babu, N. Jithendra
    Sarma, Rajkumar
    2015 INTERNATIONAL CONFERENCE ON SMART TECHNOLOGIES AND MANAGEMENT FOR COMPUTING, COMMUNICATION, CONTROLS, ENERGY AND MATERIALS (ICSTM), 2015, : 411 - 417
  • [29] A New Architecture for Accurate Dot Product of Floating Point Numbers
    Zaki, Ahmad M.
    Eldin, Ayman M. Bahaa
    El-Shafey, Mohamed H.
    Ali, Gamal M.
    ICCES'2010: THE 2010 INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING & SYSTEMS, 2010, : 139 - 145