UNSUPERVISED MONAURAL SPEECH ENHANCEMENT USING ROBUST NMF WITH LOW-RANK AND SPARSE CONSTRAINTS

Cited by: 0

Authors:

Li, Yinan [1]

Zhang, Xiongwei [1]

Sun, Meng [1]

Min, Gang [1]

Affiliation:

[1] PLA Univ Sci & Technol, Lab Intelligent Informat Proc, Nanjing, Jiangsu, Peoples R China

Source:

2015 IEEE CHINA SUMMIT & INTERNATIONAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING | 2015

Keywords:

speech enhancement; low-rank and sparse decomposition; non-negative matrix factorization

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

Non-negative spectrogram decomposition and its variants have been extensively investigated for speech enhancement because they efficiently extract perceptually meaningful components from mixtures. These approaches usually assume that training samples for one or more sources are available beforehand. In many real-world scenarios, however, prior training is not possible. To solve this problem, we propose an approach that extracts representations of the background noise directly from the noisy speech by imposing non-negative constraints on the low-rank and sparse decomposition of the noisy spectrogram. The noise representations are subsequently used when estimating the clean speech. This technique can uncover underlying regularities in the spectral structure, which helps reconstruct the clean speech. Evaluations on the Noisex-92 and TIMIT databases show that the proposed method achieves significant improvements over state-of-the-art methods in unsupervised speech enhancement.
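The abstract does not give the cost function or update rules, so the following is only a minimal sketch of the general idea of a non-negative low-rank plus sparse decomposition, not the authors' algorithm. It assumes a squared-error cost with an L1 penalty on the sparse part, treats the low-rank term `W @ H` as the noise-like component and the sparse term `S` as the speech-like component, and alternates a non-negative soft threshold with standard multiplicative NMF updates. The function name `robust_nmf`, the parameters `rank` and `lam`, and the synthetic test matrix are all hypothetical.

```python
import numpy as np


def robust_nmf(V, rank=2, lam=0.5, n_iter=100, eps=1e-9, seed=0):
    """Split a non-negative matrix V into a low-rank part W @ H and a sparse part S.

    Illustrative objective (an assumption, not taken from the paper):
        ||V - W @ H - S||_F^2 + lam * sum(S),  with W, H, S >= 0.
    """
    V = np.asarray(V, dtype=float)
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    S = np.zeros_like(V)
    history = []
    for _ in range(n_iter):
        # Sparse part: exact minimiser for fixed W, H (non-negative soft threshold).
        S = np.maximum(V - W @ H - lam / 2.0, 0.0)
        # Low-rank part: multiplicative updates on the residual R = V - S,
        # which stays non-negative by construction of S.
        R = V - S
        H *= (W.T @ R) / (W.T @ W @ H + eps)
        W *= (R @ H.T) / (W @ H @ H.T + eps)
        history.append(np.sum((V - W @ H - S) ** 2) + lam * np.sum(S))
    return W, H, S, history


# Synthetic stand-in for a magnitude spectrogram (frequency bins x frames).
rng = np.random.default_rng(1)
noise = rng.random((20, 2)) @ rng.random((2, 50))          # low-rank component
speech = np.where(rng.random((20, 50)) < 0.05, 5.0, 0.0)   # sparse component
V = noise + speech
W, H, S, history = robust_nmf(V, rank=2)
```

In this sketch `W` plays the role of a noise dictionary learned from the noisy input alone, without prior training data. How the paper then uses the noise representation to estimate the clean speech is not stated in this record and is not modelled here.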


Pages: 1 / 4

Number of pages: 4

50 records in total

[1] Deep Neural Network Based Monaural Speech Enhancement with Sparse and Low-Rank Decomposition
Shi, Wenhua
Zhang, Xiongwei
Sun, Meng
Zou, Xia
Wei, Yanmin
Min, Gang
2017 17TH IEEE INTERNATIONAL CONFERENCE ON COMMUNICATION TECHNOLOGY (ICCT 2017), 2017 : 1644 - 1647
[2] SPEECH ENHANCEMENT BY SPARSE, LOW-RANK, AND DICTIONARY SPECTROGRAM DECOMPOSITION
Chen, Zhuo
Ellis, Daniel P. W.
2013 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA), 2013
[3] Speech Enhancement Under Low SNR Conditions Via Noise Estimation Using Sparse and Low-Rank NMF with Kullback-Leibler Divergence
Sun, Meng
Li, Yinan
Gemmeke, Jort F.
Zhang, Xiongwei
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2015, 23 (07) : 1233 - 1242
[4] Deep Neural Network Based Monaural Speech Enhancement with Low-Rank Analysis and Speech Present Probability
Shi, Wenhua
Zhang, Xiongwei
Zou, Xia
Sun, Meng
Han, Wei
Li, Li
Min, Gang
IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2018, E101A (03) : 585 - 589
[5] Robust Low-Rank and Sparse Tensor Decomposition for Low-Rank Tensor Completion
Shi, Yuqing
Du, Shiqiang
Wang, Weilan
PROCEEDINGS OF THE 33RD CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2021), 2021 : 7138 - 7143
[6] A PERCEPTUALLY MOTIVATED APPROACH VIA SPARSE AND LOW-RANK MODEL FOR SPEECH ENHANCEMENT
Min, Gang
Zhang, Xiongwei
Yang, Jibin
Han, Wei
Zou, Xia
2016 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO (ICME), 2016
[7] Unsupervised Robust Projection Learning by Low-Rank and Sparse Decomposition for Hyperspectral Feature Extraction
Song, Xin
Li, Heng-Chao
Pan, Lei
Deng, Yang-Jun
Zhang, Pu
You, Li
Du, Qian
IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19
[8] Unsupervised low-rank representations for speech emotion recognition
Paraskevopoulos, Georgios
Tzinis, Efthymios
Ellinas, Nikolaos
Giannakopoulos, Theodoros
Potamianos, Alexandros
INTERSPEECH 2019, 2019 : 939 - 943
[9] A novel speech enhancement method by learnable sparse and low-rank decomposition and domain adaptation
Mavaddaty, Samira
Ahadi, Seyed Mohammad
Seyedin, Sanaz
SPEECH COMMUNICATION, 2016, 76 : 42 - 60
[10] Speech Enhancement Based on Bayesian Low-Rank and Sparse Decomposition of Multichannel Magnitude Spectrograms
Bando, Yoshiaki
Itoyama, Katsutoshi
Konyo, Masashi
Tadokoro, Satoshi
Nakadai, Kazuhiro
Yoshii, Kazuyoshi
Kawahara, Tatsuya
Okuno, Hiroshi G.
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2018, 26 (02) : 215 - 230
