A Primer on Deep Learning Architectures and Applications in Speech Processing

Cited by: 16
Authors
Ogunfunmi, Tokunbo [1]
Ramachandran, Ravi Prakash [2]
Togneri, Roberto [3]
Zhao, Yuanjun [3]
Xia, Xianjun [3]
Affiliations
[1] Santa Clara Univ, Dept Elect Engn, Santa Clara, CA 95053 USA
[2] Rowan Univ, Dept Elect & Comp Engn, Glassboro, NJ USA
[3] Univ Western Australia, Dept EEC Engn, Perth, WA, Australia
Keywords
Deep learning; Signal processing; Discriminative algorithms; Robust speaker recognition; Neural networks; Front-end; Enhancement; Model
DOI
10.1007/s00034-019-01157-3
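Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809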
Abstract
In recent years, deep-learning-based machine learning methods have demonstrated remarkable success across a wide range of learning tasks in multiple domains. They are well suited to complex classification and regression problems in applications such as computer vision, speech recognition, and other branches of pattern analysis. This article provides a timely review of, and introduction to, popular state-of-the-art discriminative deep learning techniques (DNNs, CNNs, and RNNs), covering the basic framework and algorithms, hardware implementations, applications in speech, and the overall benefits of deep learning.
Pages: 3406-3432
Page count: 27