A SOURCE/FILTER MODEL WITH ADAPTIVE CONSTRAINTS FOR NMF-BASED SPEECH SEPARATION

Cited by: 0

Authors:

Bouvier, Damien [1]

Obin, Nicolas [1]

Liuni, Marco [1]

Roebel, Axel [1]

Affiliations:

[1] UPMC, IRCAM, CNRS, UMR STMS IRCAM, Paris, France

Source:

2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS | 2016

Keywords:

speech separation; non-negative matrix factorization; source/filter model; constraints; NONNEGATIVE MATRIX FACTORIZATION; PARTS;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper introduces a constrained source/filter model for semi-supervised speech separation based on non-negative matrix factorization (NMF). The objective is to inform NMF with prior knowledge about speech, providing a physically meaningful speech separation. To do so, a source/filter model (referred to as the Instantaneous Mixture Model, or IMM) is integrated into the NMF. Furthermore, constraints are added to the IMM-NMF in order to control the NMF behaviour during separation and to enforce its physical meaning. In particular, a speech-specific constraint, based on the source/filter coherence of speech, and a method for the automatic adaptation of the constraints' weights during separation are presented. The proposed source/filter model is also semi-supervised: during training, one filter basis is estimated for each phoneme of a speaker; during separation, the estimated filter bases are then used in the constrained source/filter model. An experimental evaluation for speech separation was conducted on the TIMIT speakers database mixed with various environmental background noises from the QUT-NOISE database. This evaluation showed that the use of adaptive constraints increases the performance of the source/filter model for speaker-dependent speech separation, and that the model compares favorably to fully-supervised speech separation.
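The semi-supervised setting described in the abstract (speech bases learned in training and kept fixed, noise bases estimated during separation) can be illustrated with a minimal sketch of plain semi-supervised NMF. This is not the paper's IMM-NMF: it has no source/filter decomposition, no constraints, and no adaptive weights. The function name `semi_supervised_nmf`, its parameters, the Kullback-Leibler multiplicative updates, and the soft-mask reconstruction are all assumptions chosen for illustration.

```python
import numpy as np

def semi_supervised_nmf(V, W_speech, n_noise=4, n_iter=200, eps=1e-12, seed=0):
    """Split a non-negative spectrogram V into speech and noise estimates.

    W_speech holds pre-trained speech bases (columns) and stays fixed;
    the noise bases and all activations are estimated from V.
    Illustrative baseline only, not the constrained IMM-NMF of the paper.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    k_s = W_speech.shape[1]

    # Random non-negative initialisation of the free parameters.
    W_noise = rng.random((n_freq, n_noise)) + eps
    H = rng.random((k_s + n_noise, n_frames)) + eps
    ones = np.ones_like(V)

    for _ in range(n_iter):
        # Update all activations (KL-divergence multiplicative rule).
        W = np.hstack([W_speech, W_noise])
        V_hat = W @ H + eps
        H *= (W.T @ (V / V_hat)) / (W.T @ ones + eps)

        # Update only the noise bases; the speech bases are never touched.
        V_hat = W @ H + eps
        H_noise = H[k_s:]
        W_noise *= ((V / V_hat) @ H_noise.T) / (ones @ H_noise.T + eps)

    # Soft mask: share of the model energy explained by the speech part.
    S_hat = W_speech @ H[:k_s]
    N_hat = W_noise @ H[k_s:]
    mask = S_hat / (S_hat + N_hat + eps)
    return mask * V, (1.0 - mask) * V


if __name__ == "__main__":
    # Synthetic example: 3 known speech bases plus 2 unknown noise bases.
    rng = np.random.default_rng(1)
    W_s = rng.random((16, 3))
    V = W_s @ rng.random((3, 10)) + rng.random((16, 2)) @ rng.random((2, 10))
    speech, noise = semi_supervised_nmf(V, W_s, n_noise=2, n_iter=50)
    print(speech.shape, noise.shape)
```

Because the two estimates come from complementary masks applied to the same spectrogram, they always sum back to the input, which makes the sketch easy to sanity-check on synthetic data.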


Pages: 131-135

Page count: 5
