Feature normalization based on non-extensive statistics for speech recognition

Cited by: 17
Authors
Pardede, Hilman F. [1 ]
Iwano, Koji [2 ]
Shinoda, Koichi [1 ]
Affiliations
[1] Tokyo Inst Technol, Dept Comp Sci, Grad Sch Informat Sci & Engn, Meguro Ku, Tokyo 1528552, Japan
[2] Tokyo City Univ, Fac Environm & Informat Studies, Tsuzuki Ku, Yokohama, Kanagawa 2248551, Japan
Keywords
Robust speech recognition; Normalization; q-Logarithm; Non-extensive statistics; CROSS-TERMS; NOISE; MODEL; ENHANCEMENT; ENVIRONMENT; SPECTRA; ALGEBRA;
DOI
10.1016/j.specom.2013.02.004
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206; 082403;
Abstract
Most compensation methods for improving the robustness of speech recognition systems in noisy environments, such as spectral subtraction, CMN, and MVN, rely on the assumption that noise and speech spectra are independent. However, the use of a limited window in signal processing may introduce a cross-term between them, which degrades speech recognition accuracy. To tackle this problem, we introduce the q-logarithmic (q-log) spectral domain of non-extensive statistics and propose q-log spectral mean normalization (q-LSMN), which is an extension of log spectral mean normalization (LSMN) to this domain. Recognition experiments on a synthesized noisy speech database, the Aurora-2 database, showed that q-LSMN was consistently better than the conventional normalization methods CMN, LSMN, and MVN. Furthermore, q-LSMN was even more effective when applied to a real noisy environment in the CENSREC-2 database, where it significantly outperformed the ETSI AFE front-end. (C) 2013 Elsevier B.V. All rights reserved.
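As a rough illustration of the q-log spectral domain named in the abstract, the sketch below implements the standard Tsallis q-logarithm, ln_q(x) = (x^(1-q) - 1) / (1 - q), which reduces to the natural logarithm as q approaches 1, followed by a plain per-band mean subtraction in that domain. The abstract does not give the paper's exact q-LSMN formulation, so the function names (`q_log`, `q_exp`, `q_log_mean_normalize`), the use of ordinary subtraction of the mean, and the toy input values are all assumptions made for illustration only.

```python
import math


def q_log(x, q):
    """Tsallis q-logarithm of a positive number; equals ln(x) when q == 1."""
    if x <= 0:
        raise ValueError("x must be positive")
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)


def q_exp(y, q):
    """Inverse of q_log on its range; equals exp(y) when q == 1."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(y)
    base = 1.0 + (1.0 - q) * y
    if base <= 0:
        raise ValueError("y is outside the range of q_log for this q")
    return base ** (1.0 / (1.0 - q))


def q_log_mean_normalize(frames, q):
    """Hypothetical helper: map each spectral value to the q-log domain,
    then subtract the per-band mean taken over all frames.

    frames: list of frames, each a list of positive spectral values.
    This is an assumed simplification, not the paper's stated algorithm.
    """
    q_frames = [[q_log(v, q) for v in frame] for frame in frames]
    n_frames = len(q_frames)
    n_bands = len(q_frames[0])
    means = [sum(f[b] for f in q_frames) / n_frames for b in range(n_bands)]
    return [[f[b] - means[b] for b in range(n_bands)] for f in q_frames]


# Toy spectra (made-up values): 3 frames, 2 frequency bands.
toy_frames = [[1.0, 4.0], [2.0, 9.0], [4.0, 16.0]]
normalized = q_log_mean_normalize(toy_frames, q=0.5)
```

With q = 1 this sketch coincides with ordinary log spectral mean normalization (LSMN), which is the sense in which the abstract describes q-LSMN as an extension of LSMN.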
Pages: 587-599
Number of pages: 13
Related papers
  • [21] Feature Normalization Using Structured Full Transforms for Robust Speech Recognition
    Xiao, Xiong
    Li, Jinyu
    Chng, Eng Siong
    Li, Haizhou
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 700 - +
  • [22] A non-extensive statistics of the fault-population at the Valles Marineris extensional province, Mars
    Vallianatos, Filippos
    Sammonds, Peter
    TECTONOPHYSICS, 2011, 509 (1-2) : 50 - 54
  • [23] Natural image segmentation with non-extensive mixture models
    Stosic, Dusan
    Stosic, Darko
    Ludermir, Teresa Bernarda
    Ren, Tsang Ing
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2019, 63
  • [24] Non-extensive approach to quark matter
    Biro, T. S.
    Purcsel, G.
    Uermoessy, K.
    EUROPEAN PHYSICAL JOURNAL A, 2009, 40 (03) : 325 - 340
  • [25] Prediction of the morbidity and mortality rates of COVID-19 in Egypt using non-extensive statistics
    Yassin, Hayam
    Elyazeed, Eman R. Abo R.
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [26] Optimal investment problem under non-extensive statistical mechanics
    Liu, Limin
    Zhang, Lin
    Fan, Shiqi
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2018, 75 (10) : 3549 - 3557
  • [27] Non-extensive framework for earthquakes: The role of fragments
    Sotolongo-Costa, Oscar
    ACTA GEOPHYSICA, 2012, 60 (03) : 526 - 534
  • [28] Dust acoustic instability with non-extensive distribution
    Liu, San Qiu
    Qiu, Hui Bin
    JOURNAL OF PLASMA PHYSICS, 2013, 79 : 105 - 111
  • [29] A non-extensive thermodynamic theory of ecological systems
    Le Van Xuan
    Nguyen Khac Ngoc
    Nguyen Tri Lan
    Nguyen Ai Viet
    41ST VIETNAM NATIONAL CONFERENCE ON THEORETICAL PHYSICS, 2017, 865
  • [30] Irrelevant variability normalization based HMM training using MAP estimation of feature transforms for robust speech recognition
    Zhu, Donglai
    Huo, Qiang
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 4717 - +