The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio

Cited by: 22

Authors:

Liang, Shan [1]

Liu, Wenju [1]

Jiang, Wei [1]

Xue, Wei [1]

Affiliation:

[1] Chinese Acad Sci, Inst Automat, NLPR, Beijing 100190, Peoples R China

Source:

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA | 2013 / Vol. 134 / No. 05

Keywords:

ENHANCEMENT; RECOGNITION; BINARY;

DOI:

10.1121/1.4824632

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206 ; 082403 ;

Abstract:

In this paper, a computational goal for a monaural speech separation system is proposed. Since this goal is derived by maximizing the signal-to-noise ratio (SNR), it is called the optimal ratio mask (ORM). Under the approximate W-Disjoint Orthogonality assumption, which almost always holds due to the sparse nature of speech, theoretical analysis shows that the ORM can improve the SNR by about 10 log10(2) dB (roughly 3 dB) over the ideal ratio mask. With three kinds of real-world interference, the speech separation results in terms of SNR gain and objective quality evaluation demonstrate the correctness of the theoretical analysis and imply that the ORM achieves better separation performance. © 2013 Acoustical Society of America
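
The abstract does not give the mask formulas, so the following is only an illustrative sketch and not the authors' implementation. It assumes the ideal ratio mask is the power ratio |S|^2 / (|S|^2 + |N|^2) and that an SNR-maximizing real-valued mask can be obtained per time-frequency bin by least squares, M = Re(S Y*) / |Y|^2 with Y = S + N. The function names and the random complex test data are hypothetical; because the data are not sparse like speech, the roughly 3 dB figure from the abstract is not expected to be reproduced here.

```python
import numpy as np

def ideal_ratio_mask(S, N):
    """Assumed power-based ideal ratio mask: |S|^2 / (|S|^2 + |N|^2)."""
    ps, pn = np.abs(S) ** 2, np.abs(N) ** 2
    return ps / (ps + pn + 1e-12)

def snr_optimal_real_mask(S, N):
    """Real mask minimizing |S - M * (S + N)|^2 in each bin (least squares)."""
    Y = S + N
    return np.real(S * np.conj(Y)) / (np.abs(Y) ** 2 + 1e-12)

def output_snr_db(S, N, M):
    """SNR after masking, treating S - M * (S + N) as the residual error."""
    err = S - M * (S + N)
    return 10 * np.log10(np.sum(np.abs(S) ** 2) / np.sum(np.abs(err) ** 2))

# Hypothetical stand-ins for the STFTs of speech (S) and interference (N).
rng = np.random.default_rng(0)
shape = (257, 100)  # (frequency bins, frames), chosen arbitrarily
S = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
N = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

snr_irm = output_snr_db(S, N, ideal_ratio_mask(S, N))
snr_opt = output_snr_db(S, N, snr_optimal_real_mask(S, N))
print(f"IRM: {snr_irm:.2f} dB, SNR-optimal real mask: {snr_opt:.2f} dB")
```

Since the least-squares mask minimizes the residual energy in every bin among real-valued masks, its output SNR under this definition cannot fall below that of the ratio mask on the same data.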


Pages: EL452-EL458

Page count: 7

