Reverberant Speech Recognition Based on Denoising Autoencoder

Cited by: 0
Authors
Ishii, Takaaki [1]
Komiyama, Hiroki [1]
Shinozaki, Takahiro [2]
Horiuchi, Yasuo [1]
Kuroiwa, Shingo [1]
Affiliations
[1] Chiba Univ, Grad Sch Adv Integrat Sci, Div Informat Sci, Chiba, Japan
[2] Tokyo Inst Technol, Interdisciplinary Grad Sch Sci & Engn, Dept Informat Proc, Tokyo, Japan
Source
14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5 | 2013
Keywords
Denoising autoencoder; reverberant speech recognition; restricted Boltzmann machine; distant-talking speech recognition; CENSREC-4; REPRESENTATIONS;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A denoising autoencoder is applied to reverberant speech recognition as a noise-robust front-end that reconstructs the clean speech spectrum from noisy input. To capture the context effects of speech sounds, multiple short-windowed spectral frames are concatenated to form a single input vector. Additionally, a combination of short- and long-term spectra is investigated to properly handle the long impulse response of reverberation while keeping the time resolution necessary for speech recognition. Experiments are performed using the CENSREC-4 dataset, which is designed as an evaluation framework for distant-talking speech recognition. Experimental results show that the proposed denoising-autoencoder-based front-end using the short-windowed spectra gives better results than conventional methods. Combining it with the long-term spectra yields further improvement. The recognition accuracy of the proposed method using the short- and long-term spectra is 97.0% on the open-condition test set of the dataset, compared with 87.8% for a baseline based on multi-condition training. As a supplemental experiment, large-vocabulary speech recognition is also performed, and it confirms the effectiveness of the proposed method.
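The frame concatenation described in the abstract can be illustrated with a minimal sketch. The abstract does not state the window size, the spectral dimension, or how utterance edges are handled, so the function name `stack_context`, the `left`/`right` parameters, and the edge padding below are assumptions for illustration only, not the authors' implementation. The autoencoder itself and the long-term spectra are not shown.

```python
import numpy as np


def stack_context(frames: np.ndarray, left: int, right: int) -> np.ndarray:
    """Concatenate each frame with its neighbours into one input vector.

    frames: array of shape (num_frames, num_bins) holding spectral frames.
    left, right: number of context frames on each side (hypothetical values).
    Edges are padded by repeating the first/last frame (an assumption).
    Returns an array of shape (num_frames, (left + 1 + right) * num_bins).
    """
    num_frames = frames.shape[0]
    # Repeat the boundary frames so every position has a full window.
    padded = np.pad(frames, ((left, right), (0, 0)), mode="edge")
    # One shifted copy of the sequence per position in the window.
    windows = [padded[i:i + num_frames] for i in range(left + 1 + right)]
    # Joining along the feature axis gives one long vector per frame.
    return np.concatenate(windows, axis=1)


# Toy input: 4 frames with 3 spectral bins each.
spectra = np.arange(12, dtype=float).reshape(4, 3)
stacked = stack_context(spectra, left=1, right=1)
print(stacked.shape)
```

Each row of `stacked` would be one input vector for the front-end; a larger window gives the network more temporal context at the cost of a larger input layer.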
Pages: 3479-3483
Number of pages: 5