Improved ETSI Advanced Front-End for ASR Based on Robust Complex Speech Analysis

Citations: 0
Authors
Higa, Keita [1]
Funaki, Keiichi [2]
Affiliations
[1] Univ Ryukyus, Grad Sch Sci & Engn, Nishihara, Okinawa 90301, Japan
[2] Univ Ryukyus, C&N Ctr, Nishihara, Okinawa 90301, Japan
Source
2016 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA), 2016
Keywords
robust ASR; AFE; TV-CAR; ELS;
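DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract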
Automatic speech recognition (ASR) is in common use today. Current ASR systems perform well in ideal environments; however, they do not perform well in realistic noisy environments. For robust ASR, ETSI has standardized the Advanced Front-End (AFE), which adopts a two-stage iterative Wiener filter (IWF) to realize speech enhancement as the front-end of ASR. In the ETSI AFE, the FFT is used to estimate the speech spectrum from which the Wiener filter is designed. Separately, we have already proposed robust complex speech analysis for an analytic signal. It can estimate a more robust and more accurate speech spectrum owing to the introduced robust criterion and the nature of the analytic signal. This paper proposes an improved AFE that uses wide-band robust ELS (Extended Least Square) complex analysis and real-valued analysis instead of the FFT. Experimental results on the CENSREC-2 speech database demonstrate that the performance is improved.
Pages: 4