PSO-based optimized CNN for Hindi ASR

Cited by: 23
Authors
Passricha, Vishal [1]
Aggarwal, Rajesh Kumar [1]
Affiliations
[1] National Institute of Technology, Kurukshetra, Haryana, India
Keywords
CNN; Hyperparameter selection; PSO; Optimization; Convolutional neural networks; Speech
DOI
10.1007/s10772-019-09652-3
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract
The convolutional neural network (CNN) is a successful deep learning algorithm that has proven effective in a variety of vision tasks. Its performance depends directly on its hyperparameters. However, designing a CNN architecture requires expert knowledge of its intrinsic structure or a great deal of trial and error. To overcome these issues, the optimal CNN architecture needs to be designed automatically, without human intervention. We therefore try to remove the constraints that traditional architectures place on the number and type of convolutional and pooling layers. Biologically inspired approaches have not been extensively exploited for this task. This paper attempts to automatically optimize the hyperparameters of a CNN architecture for speech recognition using particle swarm optimization (PSO), a population-based stochastic optimization technique. The proposed method is evaluated by designing a CNN architecture for speech recognition on a Hindi dataset. The experimental results show that the proposed method designs a competitive CNN architecture that performs similarly to other state-of-the-art methods.
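As a rough illustration of the population-based search the abstract refers to, the following is a minimal sketch of a standard global-best PSO loop, not the authors' implementation. The function names, the inertia and acceleration coefficients, the search bounds, and the toy objective are all assumptions made for illustration; the toy objective stands in for the validation error of a trained CNN, which the record does not specify.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=50,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic global-best PSO over box-constrained real-valued parameters."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position; the swarm shares one global best.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    """Hypothetical stand-in for CNN validation error (assumption, not from the paper)."""
    n_layers, n_filters = params
    return (n_layers - 4.0) ** 2 + ((n_filters - 64.0) / 16.0) ** 2


# Hypothetical search space: number of layers in [1, 10], filters per layer in [8, 256].
best, err = pso_minimize(toy_validation_error, bounds=[(1, 10), (8, 256)])
print(best, err)
```

In a real hyperparameter search, the objective would train and evaluate a CNN for each particle, and integer-valued hyperparameters such as layer counts would need rounding or a discrete encoding; both are left out of this sketch.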
Pages: 1123-1133
Number of pages: 11