A SOURCE/FILTER MODEL WITH ADAPTIVE CONSTRAINTS FOR NMF-BASED SPEECH SEPARATION

Cited by: 0
Authors
Bouvier, Damien [1 ]
Obin, Nicolas [1 ]
Liuni, Marco [1 ]
Roebel, Axel [1 ]
Affiliations
[1] UPMC, IRCAM, CNRS, UMR STMS IRCAM, Paris, France
Source
2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS | 2016
Keywords
speech separation; non-negative matrix factorization; source/filter model; constraints; NONNEGATIVE MATRIX FACTORIZATION; PARTS;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
This paper introduces a constrained source/filter model for semi-supervised speech separation based on non-negative matrix factorization (NMF). The objective is to inform NMF with prior knowledge about speech, so that the separation is physically meaningful. To do so, a source/filter model (referred to as the Instantaneous Mixture Model, or IMM) is integrated into the NMF. Furthermore, constraints are added to the IMM-NMF in order to control the behaviour of the NMF during separation and to enforce its physical meaning. In particular, a speech-specific constraint, based on the source/filter coherence of speech, and a method for automatically adapting the constraint weights during separation are presented. The proposed source/filter model is also semi-supervised: during training, one filter basis is estimated for each phoneme of a speaker; during separation, the estimated filter bases are used in the constrained source/filter model. An experimental evaluation of speech separation was conducted on the TIMIT speakers database mixed with various environmental background noises from the QUT-NOISE database. This evaluation showed that the use of adaptive constraints increases the performance of the source/filter model for speaker-dependent speech separation, and that the model compares favorably to fully-supervised speech separation.
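The semi-supervised setting described in the abstract (speech bases learned in training and held fixed, noise bases estimated during separation) can be illustrated with a minimal sketch of plain semi-supervised NMF. This is not the IMM-NMF of the paper: it has no source/filter decomposition, no constraints, and no adaptive weights. It assumes a magnitude spectrogram `V`, a pre-trained speech basis `W_speech`, and the standard multiplicative updates for the Kullback-Leibler divergence; the function name `semi_supervised_nmf` and all of its parameters are hypothetical.

```python
import numpy as np

def semi_supervised_nmf(V, W_speech, n_noise=4, n_iter=100, eps=1e-12, seed=0):
    """Separate speech from noise in a non-negative spectrogram V.

    Hypothetical helper: W_speech (freq x k_s) is assumed to be pre-trained
    and stays fixed; only the noise basis and all activations are updated.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    k_s = W_speech.shape[1]

    # Random non-negative initialisation of the free parameters.
    W_noise = rng.random((n_freq, n_noise)) + eps
    H = rng.random((k_s + n_noise, n_frames)) + eps
    ones = np.ones_like(V)

    for _ in range(n_iter):
        W = np.hstack([W_speech, W_noise])

        # KL multiplicative update of all activations.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T @ ones + eps)

        # KL multiplicative update of the noise basis only.
        WH = W @ H + eps
        H_noise = H[k_s:]
        W_noise *= ((V / WH) @ H_noise.T) / (ones @ H_noise.T + eps)

    # Soft (Wiener-like) mask built from the two partial reconstructions.
    speech = W_speech @ H[:k_s]
    noise = W_noise @ H[k_s:]
    mask = speech / (speech + noise + eps)
    return mask * V, mask
```

In such a sketch, the returned mask would typically be applied to the complex mixture spectrogram before inverting the transform; the constraints described in the abstract would enter as extra penalty terms in the update rules, which are omitted here.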
Pages: 131-135
Number of pages: 5