Single-channel multiple regression for in-car speech enhancement

Cited by: 1
Authors
Li, WF [1]
Itou, K
Takeda, K
Itakura, F
Affiliations
[1] Nagoya Univ, Grad Sch Engn, Dept Informat Elect, Nagoya, Aichi 4648603, Japan
[2] Nagoya Univ, Grad Sch Engn, Dept Med Sci, Nagoya, Aichi 4648603, Japan
[3] Meijo Univ, Fac Sci & Technol, Nagoya, Aichi 4688502, Japan
Keywords
speech enhancement; speech recognition; multi-layer perceptron; mean opinion score; pairwise preference test; environmental adaptation; K-means clustering
DOI
10.1093/ietisy/e89-d.3.1032
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
We address the problem of improving hands-free speech enhancement and speech recognition performance in different car environments using a single distant microphone. This paper describes a new single-channel in-car speech enhancement method that estimates the log spectra of speech at a close-talking microphone by nonlinear regression on the log spectra of the noisy signal captured by a distant microphone and of the estimated noise. In our subjective evaluation, the regression-enhanced speech shows significant overall quality improvements, and the method performs best on most objective measures. In isolated word recognition experiments conducted under 15 real car environments, the proposed adaptive nonlinear regression approach achieves average relative word error rate (WER) reductions of 50.8% and 13.1% compared to the original noisy speech and the ETSI advanced front-end (ETSI ES 202 050), respectively.
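The regression idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the data are synthetic log spectra rather than car recordings, the network is a single small one-hidden-layer perceptron shared across all bins, and the names `clean_log`, `noise_est`, `forward`, and `train` are hypothetical. The environmental adaptation step suggested by the keywords (K-means clustering) is not modeled.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_bins, n_hidden = 512, 8, 16

# Synthetic stand-ins for real recordings (assumption, not the paper's data).
clean_log = rng.normal(0.0, 1.0, (n_frames, n_bins))    # close-talking log power
noise_log = rng.normal(-1.0, 1.0, (n_frames, n_bins))   # in-car noise log power
noisy_log = np.logaddexp(clean_log, noise_log)          # distant mic: powers add
noise_est = noise_log + rng.normal(0.0, 0.3, noise_log.shape)  # imperfect estimate

# Regression inputs: noisy log spectrum and estimated noise log spectrum.
X = np.hstack([noisy_log, noise_est])
Y = clean_log

# One-hidden-layer perceptron parameters.
W1 = rng.normal(0.0, 0.1, (X.shape[1], n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.1, (n_hidden, n_bins))
b2 = np.zeros(n_bins)


def forward(inputs):
    """Return hidden activations and the predicted clean log spectrum."""
    hidden = np.tanh(inputs @ W1 + b1)
    return hidden, hidden @ W2 + b2


def train(epochs=300, lr=0.1):
    """Full-batch gradient descent on mean squared error; returns loss history."""
    global W1, b1, W2, b2
    history = []
    for _ in range(epochs):
        hidden, pred = forward(X)
        err = pred - Y
        history.append(float(np.mean(err ** 2)))
        g_pred = 2.0 * err / err.size                    # dLoss/dPred
        g_hidden = (g_pred @ W2.T) * (1.0 - hidden ** 2) # backprop through tanh
        W2 -= lr * (hidden.T @ g_pred)
        b2 -= lr * g_pred.sum(axis=0)
        W1 -= lr * (X.T @ g_hidden)
        b1 -= lr * g_hidden.sum(axis=0)
    return history


losses = train()
_, estimate = forward(X)  # regression-enhanced log spectra
print(f"MSE before training: {losses[0]:.3f}, after: {losses[-1]:.3f}")
```

In a real system the inputs would come from short-time spectra of the distant-microphone signal and a separate noise estimator, and the enhanced log spectra would be converted back to a waveform or passed to the recognizer front-end.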
Pages: 1032-1039
Page count: 8
Related papers
50 in total
  • [1] Adaptive nonlinear regression using multiple distributed microphones for in-car speech recognition
    Li, WF
    Miyajima, C
    Nishino, T
    Itou, K
    Takeda, K
    Itakura, F
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2005, E88A (07) : 1716 - 1723
  • [2] Comparative studies on single-channel de-noising schemes for in-car speech enhancement
    Li, Weifeng
    Itou, Katunobu
    Takeda, Kazuya
    Itakura, Fumitada
    ADVANCES FOR IN-VEHICLE AND MOBILE SYSTEMS: CHALLENGES FOR INTERNATIONAL STANDARDS, 2007 : 97 - 108
  • [3] Multiple regression of log spectra for in-car speech recognition using multiple distributed microphones
    Li, WF
    Shinde, T
    Fujimura, H
    Miyajima, C
    Nishino, T
    Itou, K
    Takeda, K
    Itakura, F
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2005, E88D (03) : 384 - 390
  • [4] Single-channel speech enhancement by subspace affinity minimization
    Tran, Dung N.
    Koishida, Kazuhito
    INTERSPEECH 2020, 2020 : 2447 - 2451
  • [5] UltraSE: Single-Channel Speech Enhancement Using Ultrasound
    Sun, Ke
    Zhang, Xinyu
    PROCEEDINGS OF THE 27TH ACM ANNUAL INTERNATIONAL CONFERENCE ON MOBILE COMPUTING AND NETWORKING (ACM MOBICOM '21), 2021 : 160 - 173
  • [6] A SUBBAND HYBRID BEAMFORMING FOR IN-CAR SPEECH ENHANCEMENT
    Fox, Charles
    Vitte, Guillaume
    Charbit, Maurice
    Prado, Jacques
    Badeau, Roland
    David, Bertrand
    2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012 : 11 - 15
  • [7] Comparative Studies of Single-Channel Speech Enhancement Techniques
    Kumar, Bittu
    Kumar, Neeraj
    Kumar, Manoj
    Prasad, S. V. S.
    Varma, Ashwini Kumar
    Ravi, Banoth
    IETE JOURNAL OF RESEARCH, 2024, 70 (06) : 5704 - 5720
  • [8] Single-Channel Speech Enhancement Using Double Spectrum
    Blass, Martin
    Mowlaee, Pejman
    Kleijn, W. Bastiaan
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016 : 1740 - 1744
  • [9] A spectral conversion approach to single-channel speech enhancement
    Mouchtaris, Athanasios
    Van der Spiegel, Jan
    Mueller, Paul
    Tsakalides, Panagiotis
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2007, 15 (04) : 1180 - 1193
  • [10] Adaptive log-spectral regression for in-car speech recognition using multiple distributed microphones
    Li, WF
    Takeda, K
    Itakura, F
    IEEE SIGNAL PROCESSING LETTERS, 2005, 12 (04) : 340 - 343