Improved feature processing for Deep Neural Networks

Cited by: 0

Authors:

Rath, Shakti P. [1, 2]

Povey, Daniel [3]

Vesely, Karel [1]

Cernocky, Jan [1]

Affiliations:

[1] Brno Univ Technol, Speech FIT, Bozetechova 2, Brno, Czech Republic

[2] Univ Cambridge, Dept Engn, Cambridge, England

[3] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA

Source:

14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we investigate alternative ways of processing MFCC-based features to use as the input to Deep Neural Networks (DNNs). Our baseline is a conventional feature pipeline that involves splicing the 13-dimensional front-end MFCCs across 9 frames, followed by applying LDA to reduce the dimension to 40 and then further decorrelation using MLLT. Confirming the results of other groups, we show that speaker adaptation applied on top of these features using feature-space MLLR is helpful. The fact that the number of parameters of a DNN is not strongly sensitive to the input feature dimension (unlike GMM-based systems) motivated us to investigate ways to increase the dimension of the features. We investigate several approaches to derive higher-dimensional features and evaluate their performance with DNNs. Our best result is obtained from splicing our baseline 40-dimensional speaker-adapted features again across 9 frames, followed by reducing the dimension to 200 or 300 using another LDA. Our final result is about 3% absolute better than our best GMM system, which is a discriminatively trained model.
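The splicing arithmetic in the abstract can be illustrated with a minimal sketch. It covers only the frame-splicing step and the resulting dimensions (13 x 9 = 117 before the first LDA, 40 x 9 = 360 before the second); the LDA, MLLT and fMLLR transforms are not implemented. The function name `splice_frames`, the symmetric context of 4 frames on each side, the repetition of edge frames at utterance boundaries, and the random stand-in data are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def splice_frames(feats: np.ndarray, context: int = 4) -> np.ndarray:
    """Concatenate each frame with `context` frames on either side.

    feats has shape (num_frames, dim); the result has shape
    (num_frames, dim * (2 * context + 1)). Frames at the utterance
    boundaries are repeated to fill the window (an assumption here,
    not something the abstract specifies).
    """
    num_frames = feats.shape[0]
    # Repeat the first and last frame `context` times along the time axis.
    padded = np.pad(feats, ((context, context), (0, 0)), mode="edge")
    # One shifted copy of the utterance per position in the window.
    windows = [padded[offset:offset + num_frames]
               for offset in range(2 * context + 1)]
    return np.concatenate(windows, axis=1)


rng = np.random.default_rng(0)

# Random stand-in for 13-dimensional MFCCs over 100 frames.
mfcc = rng.standard_normal((100, 13))
spliced_mfcc = splice_frames(mfcc)        # 9 frames x 13 dims = 117 dims

# Random stand-in for the 40-dimensional speaker-adapted features
# (the LDA + MLLT + fMLLR output in the abstract; not computed here).
adapted = rng.standard_normal((100, 40))
spliced_adapted = splice_frames(adapted)  # 9 frames x 40 dims = 360 dims

print(spliced_mfcc.shape, spliced_adapted.shape)
```

In this sketch the 360-dimensional spliced vectors are what a second LDA would reduce to 200 or 300 dimensions before they are fed to the DNN.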


Pages: 109-113

Number of pages: 5

