A Deep Learning prediction process accelerator based FPGA

Cited by: 36
Authors
Yu, Qi [1]
Wang, Chao [1]
Ma, Xiang [1]
Li, Xi [1]
Zhou, Xuehai [1]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci, Hefei, Peoples R China
Source
2015 15TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING | 2015
Keywords
FPGA; deep learning; prediction process; accelerator
DOI
10.1109/CCGrid.2015.114
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Machine learning is now widely used in applications and cloud services. Deep learning, an emerging field within machine learning, shows excellent ability in solving complex learning problems. High-performance implementations of deep learning applications are therefore important for giving users a better experience. FPGAs are a common means of accelerating algorithms, offering high performance, low power consumption, and small size. We use an FPGA to design a deep learning accelerator that focuses on the implementation of the prediction process, data access optimization, and a pipeline structure. Compared with a 2.3 GHz Core 2 CPU, our accelerator achieves promising results.
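The prediction process mentioned in the abstract is, in general terms, a forward pass through the layers of a trained network. The following is a minimal software sketch of that idea only, not the accelerator design from the paper: the fully connected layers, the sigmoid activation, the `tile` parameter, and the function names `layer_forward` and `predict` are all assumptions made for illustration. The tiled inner loop is a loose stand-in for processing inputs in fixed-size blocks, as a hardware design with limited on-chip buffering might.

```python
import math


def sigmoid(x):
    """Logistic activation (assumed here; the abstract does not name one)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases, tile=2):
    """Compute one fully connected layer.

    weights[j][i] is the weight from input i to output j.
    The inputs are visited in blocks of `tile` elements to mimic
    block-wise data access; the result does not depend on `tile`.
    """
    outputs = []
    for j, row in enumerate(weights):
        acc = biases[j]
        for start in range(0, len(inputs), tile):
            end = min(start + tile, len(inputs))
            for i in range(start, end):
                acc += row[i] * inputs[i]
        outputs.append(sigmoid(acc))
    return outputs


def predict(x, layers, tile=2):
    """Run the forward pass; `layers` is a list of (weights, biases) pairs."""
    for weights, biases in layers:
        x = layer_forward(x, weights, biases, tile)
    return x


if __name__ == "__main__":
    # Hypothetical 3-2-1 network with made-up parameters.
    layers = [
        ([[0.1, -0.2, 0.3], [0.0, 0.5, -0.5]], [0.0, 0.1]),
        ([[1.0, -1.0]], [0.0]),
    ]
    print(predict([1.0, 2.0, 3.0], layers))
```

In a hardware pipeline the multiply-accumulate, activation, and data-fetch stages would overlap in time; this sequential sketch shows only the arithmetic being computed.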
Pages: 1159-1162
Page count: 4