INVESTIGATION OF DEEP BOLTZMANN MACHINES FOR PHONE RECOGNITION

Cited by: 0
Authors
You, Zhao [1]
Wang, Xiaorui [1]
Xu, Bo [1]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Interact Digital Media Technol Res Ctr, Beijing, Peoples R China
Source
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2013
Keywords
phone recognition; acoustic modeling; Deep Boltzmann Machines; Deep Neural Networks
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In the past few years, deep neural networks (DNNs) have achieved great success in speech recognition. Layer-wise pre-training with a deep belief network (DBN) is known to be one of the critical factors in optimizing the DNN. However, the DBN has one shortcoming: its pre-training procedure is a greedy forward pass. Top-down influences on the inference process are ignored, so the pre-trained DBN is suboptimal. In this paper, we attempt to apply the deep Boltzmann machine (DBM) to acoustic modeling. The DBM has the advantages that top-down feedback is incorporated and that the parameters of all layers can be jointly optimized. Experiments are conducted on the TIMIT phone recognition task to investigate the DBM-DNN acoustic model. Compared with a DBN-DNN with the same number of parameters, the phone error rate on the core test set is reduced by 3.8% relative, and by an additional 5.1% with dropout fine-tuning.
Pages: 7600-7603
Number of pages: 4
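The contrast the abstract draws between a greedy bottom-up pass and inference with top-down feedback can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a two-hidden-layer model with units in [0, 1], omits biases, and uses hypothetical names (`dbn_forward`, `dbm_mean_field`, `W1`, `W2`). Real acoustic models would typically use real-valued inputs and much larger layers.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, x):
    # W is a list of rows; returns the matrix-vector product W @ x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def transpose(W):
    return [list(col) for col in zip(*W)]


def dbn_forward(v, W1, W2):
    """Greedy bottom-up pass: h1 depends on v only, h2 on h1 only."""
    h1 = [sigmoid(a) for a in matvec(W1, v)]
    h2 = [sigmoid(a) for a in matvec(W2, h1)]
    return h1, h2


def dbm_mean_field(v, W1, W2, n_iters=10):
    """Mean-field updates for a two-hidden-layer DBM (biases omitted).

    h1 receives a bottom-up term from v and a top-down term from h2,
    which is the feedback that the greedy pass above ignores.
    """
    W2_T = transpose(W2)
    h1, h2 = dbn_forward(v, W1, W2)  # initialise from the bottom-up pass
    bottom_up = matvec(W1, v)        # fixed, because v is observed
    for _ in range(n_iters):
        top_down = matvec(W2_T, h2)
        h1 = [sigmoid(b + t) for b, t in zip(bottom_up, top_down)]
        h2 = [sigmoid(a) for a in matvec(W2, h1)]
    return h1, h2


if __name__ == "__main__":
    v = [1.0, 0.0, 1.0]                          # toy visible vector
    W1 = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]    # 2 hidden units x 3 visible
    W2 = [[1.0, 1.0]]                            # 1 top unit x 2 hidden units
    print("greedy h1:    ", dbn_forward(v, W1, W2)[0])
    print("mean-field h1:", dbm_mean_field(v, W1, W2)[0])
```

With positive `W2`, the mean-field estimate of `h1` is pulled above the greedy estimate, because the top layer feeds back into the layer below; with `W2` set to zero the two coincide.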