Improved feature processing for Deep Neural Networks

Cited by: 0
Authors
Rath, Shakti P. [1, 2]
Povey, Daniel [3]
Vesely, Karel [1]
Cernocky, Jan [1]
Affiliations
[1] Brno Univ Technol, Speech FIT, Bozetechova 2, Brno, Czech Republic
[2] Univ Cambridge, Dept Engn, Cambridge, England
[3] Johns Hopkins Univ, Ctr Language & Speech Proc, Baltimore, MD 21218 USA
Source
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we investigate alternative ways of processing MFCC-based features for use as input to Deep Neural Networks (DNNs). Our baseline is a conventional feature pipeline that splices the 13-dimensional front-end MFCCs across 9 frames, applies LDA to reduce the dimension to 40, and then decorrelates the result further using MLLT. Confirming the results of other groups, we show that speaker adaptation applied on top of these features using feature-space MLLR is helpful. The fact that the number of parameters of a DNN is not strongly sensitive to the input feature dimension (unlike GMM-based systems) motivated us to investigate ways to increase the dimension of the features. We investigate several approaches to deriving higher-dimensional features and evaluate their performance with DNNs. Our best result is obtained by splicing our baseline 40-dimensional speaker-adapted features again across 9 frames and then reducing the dimension to 200 or 300 using another LDA. Our final result is about 3% absolute better than our best GMM system, which is a discriminatively trained model.
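The dimension bookkeeping in the pipeline described above can be followed with a small sketch. This is not the authors' implementation: it assumes that "across 9 frames" means four frames of left context, the current frame, and four frames of right context, and that edge frames are padded by repetition. The helper names (`splice`, `project`, `random_matrix`) are hypothetical, and the random matrices are untrained placeholders for the LDA, MLLT, and feature-space MLLR transforms, used only to show how the dimensions change.

```python
import random


def splice(frames, context=4):
    """Concatenate each frame with `context` neighbours on each side.

    Edge frames are padded by repeating the first/last frame. This padding
    is an assumption; the abstract does not say how edges are handled.
    """
    n = len(frames)
    spliced = []
    for t in range(n):
        row = []
        for offset in range(-context, context + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp to a valid frame index
            row.extend(frames[idx])
        spliced.append(row)
    return spliced


def project(frames, matrix):
    """Apply a linear transform; each row of `matrix` yields one output dim."""
    return [[sum(w * x for w, x in zip(row, frame)) for row in matrix]
            for frame in frames]


def random_matrix(out_dim, in_dim, rng):
    """Untrained placeholder for an estimated transform (hypothetical)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(in_dim)]
            for _ in range(out_dim)]


rng = random.Random(0)

# 20 frames of synthetic 13-dimensional "MFCC" vectors (random stand-ins).
mfcc = [[rng.gauss(0.0, 1.0) for _ in range(13)] for _ in range(20)]

spliced_1 = splice(mfcc, context=4)                          # 9 x 13 = 117
baseline = project(spliced_1, random_matrix(40, 117, rng))   # stand-in: LDA + MLLT (+ fMLLR)
spliced_2 = splice(baseline, context=4)                      # 9 x 40 = 360
final = project(spliced_2, random_matrix(200, 360, rng))     # stand-in: second LDA

dims = [len(mfcc[0]), len(spliced_1[0]), len(baseline[0]),
        len(spliced_2[0]), len(final[0])]
print(dims)  # [13, 117, 40, 360, 200]
```

Under these assumptions, the second splicing step widens the effective temporal context of each input vector, while the final projection keeps the DNN input at 200 (or, per the abstract, 300) dimensions.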
Pages: 109-113
Page count: 5