Deep learning accelerators: a case study with MAESTRO

Cited by: 5
Authors
Bolhasani, Hamidreza [1]
Jassbi, Somayyeh Jafarali [1]
Affiliations
[1] Islamic Azad Univ, Sci & Res Branch, Dept Comp Engn, Tehran, Iran
Keywords
Deep learning; Convolutional neural networks; Deep neural networks; Hardware accelerator; Deep learning accelerator
DOI
10.1186/s40537-020-00377-8
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In recent years, deep learning has become one of the most important topics in computer science. It is a growing trend at the leading edge of technology, and its applications now appear in many aspects of daily life, such as object detection, speech recognition, and natural language processing. Almost all major sciences and technologies currently benefit from the advantages of deep learning, such as high accuracy, speed, and flexibility, so any effort to improve the performance of related techniques is valuable. Deep learning accelerators are hardware architectures designed and optimized to increase the speed, efficiency, and accuracy of computers running deep learning algorithms. In this paper, after reviewing some background on deep learning, a well-known accelerator architecture named MAERI (Multiply-Accumulate Engine with Reconfigurable Interconnects) is investigated. The performance of a deep learning task is measured and compared under two different dataflow strategies, NLR (No Local Reuse) and NVDLA (NVIDIA Deep Learning Accelerator), using an open-source tool called MAESTRO (Modeling Accelerator Efficiency via Spatio-Temporal Resource Occupancy). The measured performance indicators show that the newer, optimized NVDLA dataflow achieves higher L1 and L2 computation reuse and a lower total runtime (in cycles) than NLR.
Pages: 11