The optimal ratio time-frequency mask for speech separation in terms of the signal-to-noise ratio
Cited by: 22
Authors:
Liang, Shan [1]
Liu, Wenju [1]
Jiang, Wei [1]
Xue, Wei [1]
Affiliations:
[1] Chinese Acad Sci, Inst Automat, NLPR, Beijing 100190, Peoples R China
Keywords:
ENHANCEMENT;
RECOGNITION;
BINARY;
DOI:
10.1121/1.4824632
Chinese Library Classification number:
O42 [Acoustics];
Discipline classification codes:
070206;
082403;
Abstract:
In this paper, a computational goal for a monaural speech separation system is proposed. Since this goal is derived by maximizing the signal-to-noise ratio (SNR), it is called the optimal ratio mask (ORM). Under the approximate W-Disjoint Orthogonality assumption, which almost always holds due to the sparse nature of speech, theoretical analysis shows that the ORM can improve the SNR by about $10\log_{10} 2$ dB over the ideal ratio mask. With three kinds of real-world interference, the speech separation results of SNR gain and objective quality evaluation demonstrate the correctness of the theoretical analysis, and imply that the ORM achieves a better separation performance. © 2013 Acoustical Society of America
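The abstract does not give a closed form for the ORM, so the following is only a minimal sketch of the idea of a mask "derived by maximizing the SNR". It assumes the mask is a real gain applied to each time-frequency bin of the mixture, in which case the gain that minimizes the per-bin squared error (and so maximizes output SNR) is the least-squares solution. The function names, the power-form ideal ratio mask, and the random test data are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def irm_power(s, n):
    """Assumed power-form ideal ratio mask: |S|^2 / (|S|^2 + |N|^2)."""
    return abs(s) ** 2 / (abs(s) ** 2 + abs(n) ** 2)


def ls_optimal_mask(s, n):
    """Real gain m that minimizes |s - m*(s + n)|^2 in one bin.

    Setting the derivative with respect to m to zero gives
    m = Re(s * conj(y)) / |y|^2 with y = s + n (hypothetical stand-in
    for an SNR-maximizing ratio mask).
    """
    y = s + n
    return (s * y.conjugate()).real / abs(y) ** 2


def output_snr_db(speech, noise, mask_fn):
    """Output SNR in dB: speech energy over residual error energy."""
    signal_energy = sum(abs(s) ** 2 for s in speech)
    error_energy = sum(
        abs(s - mask_fn(s, n) * (s + n)) ** 2 for s, n in zip(speech, noise)
    )
    return 10 * math.log10(signal_energy / error_energy)


def random_bin(scale):
    """Draw one complex Gaussian value as a toy spectrogram bin."""
    return complex(random.gauss(0, scale), random.gauss(0, scale))


random.seed(0)
speech_bins = [random_bin(1.0) for _ in range(2000)]  # toy data, not real speech
noise_bins = [random_bin(1.0) for _ in range(2000)]

snr_irm = output_snr_db(speech_bins, noise_bins, irm_power)
snr_opt = output_snr_db(speech_bins, noise_bins, ls_optimal_mask)
gain_quoted_db = 10 * math.log10(2)  # the gain quoted in the abstract, in dB

print(f"IRM output SNR:        {snr_irm:.2f} dB")
print(f"LS-optimal output SNR: {snr_opt:.2f} dB")
print(f"10*log10(2):           {gain_quoted_db:.2f} dB")
```

Because the least-squares gain is optimal among all real gains in every bin, its output SNR can never fall below that of the ratio mask on the same data. The toy Gaussian data does not satisfy the sparsity assumption in the abstract, so the measured difference is not expected to match the quoted figure.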
Pages: EL452-EL458
Page count: 7