The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio

Cited by: 22
Authors
Liang, Shan [1 ]
Liu, Wenju [1 ]
Jiang, Wei [1 ]
Xue, Wei [1 ]
Affiliation
[1] Chinese Acad Sci, Inst Automat, NLPR, Beijing 100190, Peoples R China
Source
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA | 2013 / Volume 134 / Issue 05
Keywords
ENHANCEMENT; RECOGNITION; BINARY;
DOI
10.1121/1.4824632
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
In this paper, a computational goal for a monaural speech separation system is proposed. Since this goal is derived by maximizing the signal-to-noise ratio (SNR), it is called the optimal ratio mask (ORM). Under the approximate W-Disjoint Orthogonality assumption, which almost always holds due to the sparse nature of speech, theoretical analysis shows that the ORM can improve the SNR by about 10 log10(2) dB (roughly 3 dB) over the ideal ratio mask. With three kinds of real-world interference, the speech separation results of SNR gain and objective quality evaluation demonstrate the correctness of the theoretical analysis, and imply that the ORM achieves a better separation performance. © 2013 Acoustical Society of America
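The abstract's comparison between an SNR-maximizing ratio mask and the ideal ratio mask can be illustrated with a small numerical sketch. This record does not give the paper's formulas, so everything below is an assumption: the ideal ratio mask is taken in its common power-ratio form, and the SNR-maximizing mask is taken as the real-valued gain that minimizes the squared error in each time-frequency bin, Re(S Y*) / |Y|^2 with Y = S + N. The function names (`irm`, `ls_ratio_mask`, `snr_db`) and the synthetic complex Gaussian "spectra" are hypothetical and stand in for real speech and interference.

```python
import math
import random


def irm(s, n):
    """Ideal ratio mask in its common power-ratio form (assumed definition)."""
    ps, pn = abs(s) ** 2, abs(n) ** 2
    return ps / (ps + pn)


def ls_ratio_mask(s, n):
    """Real-valued gain minimizing |s - m * (s + n)|^2 in one bin.

    Assumption: this least-squares closed form stands in for the
    SNR-maximizing mask; it is not quoted from the record above.
    """
    y = s + n
    return (s * y.conjugate()).real / abs(y) ** 2


def snr_db(speech, noise, mask_fn):
    """Output SNR in dB after applying mask_fn to the mixture, bin by bin."""
    signal_power = sum(abs(s) ** 2 for s in speech)
    error_power = sum(
        abs(s - mask_fn(s, n) * (s + n)) ** 2 for s, n in zip(speech, noise)
    )
    return 10 * math.log10(signal_power / error_power)


# Synthetic complex coefficients, one per time-frequency bin (illustrative only).
rng = random.Random(0)


def complex_gauss(scale):
    return complex(rng.gauss(0, scale), rng.gauss(0, scale))


speech = [complex_gauss(1.0) for _ in range(2000)]
noise = [complex_gauss(1.0) for _ in range(2000)]

snr_irm = snr_db(speech, noise, irm)
snr_orm = snr_db(speech, noise, ls_ratio_mask)

# The gain quoted in the abstract, evaluated numerically.
wdo_gain_db = 10 * math.log10(2)

print(f"SNR with ratio mask (power form): {snr_irm:.2f} dB")
print(f"SNR with least-squares mask:      {snr_orm:.2f} dB")
print(f"10 log10(2) = {wdo_gain_db:.2f} dB")
```

Because the least-squares gain minimizes the error in every bin over all real-valued masks, its output SNR cannot fall below that of the power-ratio mask on the same data. The size of the gap on this synthetic data need not match the figure in the abstract, which is stated under the approximate W-Disjoint Orthogonality assumption for real speech.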
Pages: EL452-EL458
Page count: 7