EXPLOITING SYNCHRONY SPECTRA AND DEEP NEURAL NETWORKS FOR NOISE-ROBUST AUTOMATIC SPEECH RECOGNITION

Authors
Ma, Ning [1]
Marxer, Ricard [1]
Barker, Jon [1]
Brown, Guy J. [1]
Affiliations
[1] University of Sheffield, Department of Computer Science, Sheffield S1 4DP, South Yorkshire, England
Keywords
Deep neural network; noise-robust automatic speech recognition; synchrony spectra; mask estimation
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents a novel system that exploits synchrony spectra and deep neural networks (DNNs) for automatic speech recognition (ASR) in challenging noisy environments. Synchrony spectra measure the extent to which each frequency channel in an auditory model is entrained to a particular pitch period. They are used together with F0 estimates in one of two ways: as input to a DNN for time-frequency (T-F) mask estimation, or to augment the input features of a DNN-based ASR system. The proposed approach was evaluated in the context of the CHiME-3 Challenge. Our experiments show that the synchrony spectra features work best when augmenting the input features of the DNN-based ASR system. Compared to the CHiME-3 baseline system, our best system reduces the word error rate (WER) by more than 14% absolute, achieving a WER of 18.56% on the evaluation test set.
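The abstract does not give the formula behind the synchrony spectrum. As a rough illustration only, the sketch below assumes one plausible reading: for each frequency channel, take the normalized autocorrelation of the channel signal at a lag equal to the pitch period. The function names `synchrony_at_lag` and `synchrony_spectrum`, and the use of plain sample lists in place of auditory filterbank outputs, are hypothetical choices made for this example and are not taken from the paper.

```python
import math


def synchrony_at_lag(channel, lag):
    """Normalized autocorrelation of one channel at a lag given in samples.

    Assumption: this stands in for the per-channel "entrainment" measure;
    the paper's actual definition may differ.
    """
    n = len(channel) - lag
    if lag <= 0 or n <= 0:
        raise ValueError("lag must be positive and shorter than the signal")
    head = channel[:n]
    tail = channel[lag:lag + n]
    num = sum(x * y for x, y in zip(head, tail))
    den = math.sqrt(sum(x * x for x in head) * sum(y * y for y in tail))
    # A silent channel carries no periodicity evidence, so report zero.
    return num / den if den > 0.0 else 0.0


def synchrony_spectrum(channels, pitch_period):
    """One value per frequency channel for a single candidate pitch period."""
    return [synchrony_at_lag(ch, pitch_period) for ch in channels]


# Toy stand-ins for filterbank channel outputs (hypothetical data).
in_phase = [math.sin(2 * math.pi * t / 20) for t in range(200)]   # period 20
anti_phase = [math.sin(2 * math.pi * t / 40) for t in range(200)]  # period 40
spectrum = synchrony_spectrum([in_phase, anti_phase], pitch_period=20)
```

Under this reading, a channel whose periodicity matches the candidate pitch period scores close to 1, while a channel dominated by a different periodicity scores lower. A vector of such values per frame could then be appended to the acoustic features or passed to a mask-estimation network, as the abstract describes.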
Pages: 490-495
Number of pages: 6