Parallel Back-Propagation Neural Network Training Technique Using CUDA on Multiple GPUs

Cited by: 0
Authors
Zhang, Shunlu [1 ]
Gunupudi, Pavan [1 ]
Zhang, Qi-Jun [1 ]
Affiliations
[1] Carleton Univ, Dept of Electronics, Ottawa, ON K1S 5B6, Canada
Source
2015 IEEE MTT-S INTERNATIONAL CONFERENCE ON NUMERICAL ELECTROMAGNETIC AND MULTIPHYSICS MODELING AND OPTIMIZATION (NEMO) | 2015
Keywords
Neural Network; Back Propagation; CUDA; cuBLAS; GPGPU;
DOI
Not available
Chinese Library Classification (CLC) numbers
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract
A parallel Back-Propagation (BP) neural network training technique using the Compute Unified Device Architecture (CUDA) on multiple Graphics Processing Units (GPUs) is proposed. To exploit the maximum performance of the GPUs, we implement batch-mode BP training by organizing the input, hidden, and output neurons into matrix form. The implementation uses CUDA Basic Linear Algebra Subroutines (cuBLAS) functions for the matrix and vector operations together with custom CUDA kernels. The proposed technique employs multiple GPUs to achieve further acceleration. Each GPU holds the same neural network structure and weight parameters, and the training samples are distributed across the GPUs. Each GPU computes its local training error and the gradient at each layer; these are transferred to the first GPU, which computes their sums. The summed gradients are transferred back to each GPU to update the local weights, and the process repeats until the training goal is achieved. A cavity microwave bandpass filter example is used to illustrate the validity of the technique.
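The abstract describes batch-mode BP in matrix form with cuBLAS plus a multi-GPU gradient-summation step on the first GPU. The following CUDA/C++ fragment is a minimal sketch of how such a scheme could look; it is not the paper's code, and the function names, kernel, data layout, and buffer arguments (forward_hidden, sum_gradients, d_tmp0, etc.) are illustrative assumptions for a single hidden layer in single precision with column-major cuBLAS storage.

```cuda
// Minimal sketch, not the authors' implementation. Names and layout are
// assumptions: one hidden layer, float data, one sample per matrix column.
#include <cublas_v2.h>
#include <cuda_runtime.h>

// Element-wise sigmoid applied after the cuBLAS GEMM (illustrative kernel).
__global__ void sigmoid(float *a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] = 1.0f / (1.0f + expf(-a[i]));
}

// Batch-mode hidden-layer evaluation H = sigmoid(W * X): each column of X is
// one training sample, so the whole batch is processed by a single GEMM.
void forward_hidden(cublasHandle_t handle, const float *d_W, const float *d_X,
                    float *d_H, int nHidden, int nInput, int nSamples)
{
    const float one = 1.0f, zero = 0.0f;
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                nHidden, nSamples, nInput,
                &one, d_W, nHidden, d_X, nInput, &zero, d_H, nHidden);
    int n = nHidden * nSamples;
    sigmoid<<<(n + 255) / 256, 256>>>(d_H, n);
}

// Gradient summation across GPUs as described in the abstract: every GPU
// holds a local gradient for its share of the samples; the partial gradients
// are copied to GPU 0, accumulated there, and the total is copied back so
// each GPU can update its local copy of the weights identically.
void sum_gradients(cublasHandle_t handle0, float **d_grad, float *d_tmp0,
                   int nWeights, int nGpus)
{
    const float one = 1.0f;
    cudaSetDevice(0);                       // accumulation happens on GPU 0
    for (int g = 1; g < nGpus; ++g) {
        cudaMemcpyPeer(d_tmp0, 0, d_grad[g], g, nWeights * sizeof(float));
        cublasSaxpy(handle0, nWeights, &one, d_tmp0, 1, d_grad[0], 1);
    }
    for (int g = 1; g < nGpus; ++g)         // broadcast the summed gradient
        cudaMemcpyPeer(d_grad[g], g, d_grad[0], 0, nWeights * sizeof(float));
}
```

Summing the partial gradients on the first GPU and broadcasting the result keeps every GPU's weight copy identical after each update, which is consistent with the abstract's statement that each GPU holds the same network structure and weight parameters.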
Pages: 3