A SOURCE/FILTER MODEL WITH ADAPTIVE CONSTRAINTS FOR NMF-BASED SPEECH SEPARATION

Cited by: 0

Authors:

Bouvier, Damien [1]

Obin, Nicolas [1]

Liuni, Marco [1]

Roebel, Axel [1]

Affiliations:

[1] UPMC, IRCAM, CNRS, UMR STMS IRCAM, Paris, France

Source:

2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS | 2016

Keywords:

speech separation; non-negative matrix factorization; source/filter model; constraints; NONNEGATIVE MATRIX FACTORIZATION; PARTS;

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

This paper introduces a constrained source/filter model for semi-supervised speech separation based on non-negative matrix factorization (NMF). The objective is to inform NMF with prior knowledge about speech, providing a physically meaningful speech separation. To do so, a source/filter model (indicated as Instantaneous Mixture Model or IMM) is integrated in the NMF. Furthermore, constraints are added to the IMM-NMF in order to control the NMF behaviour during separation and to enforce its physical meaning. In particular, a speech-specific constraint, based on the source/filter coherence of speech, and a method for the automatic adaptation of the constraints' weights during separation are presented. Also, the proposed source/filter model is semi-supervised: during training, one filter basis is estimated for each phoneme of a speaker; during separation, the estimated filter bases are then used in the constrained source/filter model. An experimental evaluation for speech separation was conducted on the TIMIT speakers database mixed with various environmental background noises from the QUT-NOISE database. This evaluation showed that the use of adaptive constraints increases the performance of the source/filter model for speaker-dependent speech separation, and compares favorably to fully-supervised speech separation.
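The semi-supervised setup described in the abstract (speech bases learned in training and held fixed, noise bases estimated during separation) can be illustrated with a minimal sketch. This is not the authors' IMM-NMF: it leaves out the source/filter decomposition and the adaptive constraints entirely, and it assumes plain NMF with Kullback-Leibler multiplicative updates, a choice the abstract does not state. The function name `separate` and the parameters `W_speech`, `n_noise`, and `n_iter` are hypothetical.

```python
import numpy as np

def separate(V, W_speech, n_noise=4, n_iter=200, eps=1e-12, seed=0):
    """Semi-supervised NMF sketch: V ~= [W_speech, W_noise] @ H.

    V        : non-negative magnitude spectrogram (n_freq x n_frames)
    W_speech : pre-trained speech bases (n_freq x k_s), kept fixed (assumption)
    Returns soft-masked speech and noise estimates and the activations H.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    k_s = W_speech.shape[1]
    W_noise = rng.random((n_freq, n_noise)) + eps   # free noise bases
    H = rng.random((k_s + n_noise, n_frames)) + eps  # all activations
    ones = np.ones_like(V)

    for _ in range(n_iter):
        W = np.hstack([W_speech, W_noise])
        # KL multiplicative update of the activations (speech and noise)
        H *= (W.T @ (V / (W @ H + eps))) / (W.T @ ones + eps)
        # KL multiplicative update of the noise bases only
        H_noise = H[k_s:]
        W_noise *= ((V / (W @ H + eps)) @ H_noise.T) / (ones @ H_noise.T + eps)

    W = np.hstack([W_speech, W_noise])
    V_hat = W @ H + eps
    # Wiener-like soft mask built from the speech part of the model
    mask = (W_speech @ H[:k_s]) / V_hat
    return mask * V, (1.0 - mask) * V, H
```

In this sketch only the noise bases and the activations are updated; the speech estimate is obtained by soft-masking the mixture spectrogram, so the two estimates sum back to `V`.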

Pages: 131-135

Page count: 5

Total: 50 records

[41] AN NMF-BASED METHOD FOR HYPERSPECTRAL UNMIXING USING A STRUCTURED ADDITIVELY-TUNED LINEAR MIXING MODEL TO ADDRESS SPECTRAL VARIABILITY [J].

Brezini, Salah Eddine ;

Karoui, Moussa Sofiane ;

Benhalouche, Fatima Zohra ;

Deville, Yannick ;

Ouamri, Abdelaziz .

2020 MEDITERRANEAN AND MIDDLE-EAST GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (M2GARSS), 2020, :45-48

[42] An NMF-Based Approach for Hyperspectral Unmixing Using a New Multiplicative-tuning Linear Mixing Model to Address Spectral Variability [J].

Benhalouche, Fatima Zohra ;

Karoui, Moussa Sofiane ;

Deville, Yannick .

2019 27TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2019,

[43] LOCAL GAUSSIAN MODEL WITH SOURCE-SET CONSTRAINTS IN AUDIO SOURCE SEPARATION [J].

Ikeshita, Rintaro ;

Togami, Masahito ;

Kawaguchi, Yohei ;

Fujita, Yusuke ;

Nagamatsu, Kenji .

2017 IEEE 27TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING, 2017,

[44] Single channel speech enhancement using iterative constrained NMF based adaptive wiener gain [J].

Yechuri, Sivaramakrishna ;

Vanambathina, Sunnydayal .

MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (09) :26233-26254

[45] Speech Source Separation Using Variational Autoencoder and Bandpass Filter [J].

Do, Hao Duc ;

Tran, Son Thai ;

Chau, Duc Thanh .

IEEE ACCESS, 2020, 8 :156219-156231

[46] Spherical-harmonics-based sound field decomposition and multichannel NMF for sound source separation [J].

Pezzoli, Mirco ;

Carabias-Orti, Julio ;

Vera-Candeas, Pedro ;

Antonacci, Fabio ;

Sarti, Augusto .

APPLIED ACOUSTICS, 2024, 218

[47] Statistical Model of Speech Signals Based on Composite Autoregressive System with Application to Blind Source Separation [J].

Kameoka, Hirokazu ;

Yoshioka, Takuya ;

Hamamura, Mariko ;

Le Roux, Jonathan ;

Kashino, Kunio .

LATENT VARIABLE ANALYSIS AND SIGNAL SEPARATION, 2010, 6365 :245-253

[48] A SUPERVISED MULTI-CHANNEL SPEECH ENHANCEMENT ALGORITHM BASED ON BAYESIAN NMF MODEL [J].

Chung, Hanwook ;

Plourde, Eric ;

Champagne, Benoit .

2018 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2018), 2018, :221-225

[49] SFNet: A Computationally Efficient Source Filter Model Based Neural Speech Synthesis [J].

Rao, Achuth M., V ;

Ghosh, Prasanta Kumar .

IEEE SIGNAL PROCESSING LETTERS, 2020, 27 :1170-1174

[50] Genetic Algorithm-Based Adaptive Wiener Gain for Speech Enhancement Using an Iterative Posterior NMF [J].

Yechuri, Sivaramakrishna ;

Vanabathina, Sunny Dayal .

INTERNATIONAL JOURNAL OF IMAGE AND GRAPHICS, 2023, 23 (06)
