Efficient GPU implementation of convolutional neural networks for speech recognition

Cited by: 0

Authors:

van den Berg, Ewout [1]

Brand, Daniel [2]

Bordawekar, Rajesh [2]

Rachevsky, Leonid [1]

Ramabhadran, Bhuvana [1]

Affiliations:

[1] IBM Watson, Yorktown Hts, NY 10598 USA

[2] IBM TJ Watson Res Ctr, Yorktown Hts, NY USA

Source:

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015

Keywords:

Acoustic modeling; convolutional neural networks; GPU acceleration; cuBLAS; cuDNN; cuFFT

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics]

Subject classification codes:

070206; 082403

Abstract:

Deep learning has enjoyed tremendous success in speech recognition in recent years. Despite the widespread use of deep networks, training large and complex architectures remains very time consuming. A prime example is the convolutional neural network (CNN), which has provided state-of-the-art results but is also among the most computationally intensive networks to train. In this paper, we study four different methods for GPU acceleration of CNNs: a native implementation using cuBLAS, two implementations based on NVIDIA's recently released deep-learning library cuDNN, and an implementation based on cuFFT. We analyze the performance of each of these approaches on the forward operation, the gradient computation, and the backward propagation. The overall best performance is obtained using the custom native implementation, which was found to be up to 6.9 times faster than cuDNN. The paper concludes with results on the end-to-end training speed of our CNN on an LVCSR task.

Pages: 1483-1487

Page count: 5