Hardware-Software Co-Design for an Analog-Digital Accelerator for Machine Learning

Cited by: 0
Authors
Ambrosi, Joao [1]
Ankit, Aayush [2]
Antunes, Rodrigo [1]
Chalamalasetti, Sai Rahul [1]
Chatterjee, Soumitra [1]
El Hajj, Izzat [3]
Fachini, Guilherme [1]
Faraboschi, Paolo [1]
Foltin, Martin [1]
Huang, Sitao [3]
Hwu, Wen-mei [3]
Knuppe, Gustavo [1]
Lakshminarasimha, Sunil Vishwanathpur [1]
Milojicic, Dejan [1]
Parthasarathy, Mohan [1]
Ribeiro, Filipe [1]
Rosa, Lucas [1]
Roy, Kaushik [2]
Silveira, Plinio [1]
Strachan, John Paul [1]
Affiliations
[1] Hewlett Packard Enterprise, 1500 Page Mill Rd, Palo Alto, CA 94304 USA
[2] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
[3] Univ Illinois, Urbana, IL 61801 USA
Source
2018 IEEE INTERNATIONAL CONFERENCE ON REBOOTING COMPUTING (ICRC) | 2018
Keywords
COPROCESSOR; MEMORY
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The increasing deployment of machine learning at the core and at the edge for applications such as video and image recognition has resulted in a number of special-purpose accelerators in this domain. However, these accelerators do not have full end-to-end software stacks for application development, resulting in hard-to-develop, proprietary, and suboptimal application programming and executables. In this paper, we describe a software stack for a memristor-based hybrid (analog-digital) accelerator. The software stack consists of an ONNX converter, an application optimizer, a compiler, a driver, and emulators. The ONNX converter helps leverage interoperable neural network models developed on frameworks that support ONNX, such as CNTK, Caffe2, TensorFlow, etc. The application optimization layer adapts these interoperable models to the underlying hardware. The compiler generates executable ISA code that the underlying accelerator can run. Finally, the emulator allows software to execute without actual hardware, which enables hardware design space exploration and testing. By building a software stack, we have made hybrid memristor-based ML accelerators more accessible to software developers, enabled the generation of better-performing executables, and created an environment that can be leveraged by a multitude of existing neural network models developed using other frameworks to target these accelerators.
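The stage ordering described in the abstract (converter, optimizer, compiler, emulator) can be illustrated with a toy pipeline. This is only a sketch under stated assumptions: every function name, the dictionary-based model format, and the two-opcode instruction set are hypothetical and are not taken from the paper, and the driver stage is omitted because the emulator stands in for hardware here.

```python
# Hypothetical sketch of the stack's stage ordering. None of these names,
# data layouts, or instructions come from the paper.
from typing import Dict, List, Tuple

Matrix = List[List[float]]
Instr = Tuple[str, int]  # (opcode, layer index) in an invented toy ISA


def convert(model: Dict[str, Matrix]) -> List[Matrix]:
    """Stand-in for the ONNX converter: pull weight matrices out in name order."""
    return [model[name] for name in sorted(model)]


def optimize(layers: List[Matrix]) -> List[Matrix]:
    """Stand-in for the application optimizer: drop empty layers."""
    return [w for w in layers if w and w[0]]


def compile_layers(layers: List[Matrix]) -> List[Instr]:
    """Stand-in for the compiler: one matrix-vector multiply per layer, then halt."""
    return [("MVM", i) for i in range(len(layers))] + [("HALT", 0)]


def emulate(program: List[Instr], layers: List[Matrix], x: List[float]) -> List[float]:
    """Stand-in for the emulator: run the toy program purely in software."""
    for opcode, idx in program:
        if opcode == "HALT":
            break
        w = layers[idx]
        # Matrix-vector product, the operation a crossbar would perform in analog.
        x = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]
    return x


model = {"layer0": [[1.0, 2.0], [3.0, 4.0]], "layer1": [[1.0, 1.0]]}
layers = optimize(convert(model))
program = compile_layers(layers)
result = emulate(program, layers, [1.0, 1.0])
print(result)
```

Keeping each stage as a separate function mirrors the layering in the abstract: a different front end or a different emulator could be swapped in without touching the other stages.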
Pages: 141-153
Number of pages: 13