95 entries in total
- [1] Li N., Loizou P.C., Factors influencing intelligibility of ideal binary-masked speech: Implications for noise reduction, J. Acoust. Soc. Amer., 123, pp. 1673-1682, (2008)
- [2] Narayanan A., Wang D., Ideal ratio mask estimation using deep neural networks for robust speech recognition, Proc. IEEE Int. Conf. Acoust. Speech Signal Process., pp. 7092-7096, (2013)
- [3] Erdogan H., Hershey J.R., Watanabe S., Le Roux J., Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks, Proc. IEEE Int. Conf. Acoust. Speech Signal Process., pp. 708-712, (2015)
- [4] Williamson D.S., Wang Y., Wang D., Complex ratio masking for monaural speech separation, IEEE/ACM Trans. Audio, Speech, Lang. Process., 24, 3, pp. 483-492, (2016)
- [5] Lee J., Kang H.-G., A joint learning algorithm for complex-valued TF masks in deep learning-based single-channel speech enhancement systems, IEEE/ACM Trans. Audio, Speech, Lang. Process., 27, 6, pp. 1098-1108, (2019)
- [6] Healy E.W., Vasko J.L., An ideal quantized mask to increase intelligibility and quality of speech in noise, J. Acoust. Soc. Amer., 144, pp. 1392-1405, (2018)
- [7] Luo Y., Mesgarani N., TaSNet: Time-domain audio separation network for real-time, single-channel speech separation, Proc. IEEE Int. Conf. Acoust. Speech Signal Process., pp. 696-700, (2018)
- [8] Pandey A., Wang D., A new framework for CNN-based speech enhancement in the time domain, IEEE/ACM Trans. Audio, Speech, Lang. Process., 27, 7, pp. 1179-1188, (2019)
- [9] Odelowo B.O., Anderson D.V., A study of training targets for deep neural network-based speech enhancement using noise prediction, Proc. IEEE Int. Conf. Acoust. Speech Signal Process., pp. 5409-5413, (2018)
- [10] Lu Y.-J., Wang Z.-Q., Watanabe S., Richard A., Yu C., Tsao Y., Conditional diffusion probabilistic model for speech enhancement, Proc. IEEE Int. Conf. Acoust. Speech Signal Process., pp. 7402-7406, (2022)