Monaural Source Separation Using a Random Forest Classifier

Cited by: 1

Authors:

Riday, Cosimo [1]

Bhargava, Saurabh

Hahnloser, Richard H. R.

Liu, Shih-Chii

Affiliations:

[1] Univ Zurich, Inst Neuroinformat, Zurich, Switzerland

Source:

17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016

Funding:

Swiss National Science Foundation;

Keywords:

monaural source separation; random forest; deep learning; CASA; IMPROVE SPEECH RECOGNITION; NOISE;

DOI:

10.21437/Interspeech.2016-252

Chinese Library Classification number:

O42 [Acoustics];

Subject classification code:

070206 ; 082403 ;

Abstract:

We address the problem of separating two audio sources from a single-channel mixture recording. We present a novel method, the Multi Layered Random Forest (MLRF), that learns a binary mask for both sources. Random Forest (RF) classifiers are trained for each frequency band of a source spectrogram. A specialized set of linear transformations is applied to a local time-frequency (T-F) neighborhood of the mixture to capture relevant local statistics. A sampling method is presented that efficiently samples T-F training bins in each frequency band. We draw equal numbers of dominant (more power) training samples from the two sources for the RF classifiers that estimate the Ideal Binary Mask (IBM). An estimated IBM in a given layer is used to train an RF classifier in the next higher layer of the MLRF hierarchy. On average, MLRF performs better than deep Recurrent Neural Networks (RNNs) and Non-Negative Sparse Coding (NNSC) in signal-to-noise ratio (SNR) of the reconstructed audio, overall T-F bin classification accuracy, and PESQ and STOI scores. Additionally, we demonstrate the ability of the MLRF to correctly reconstruct T-F bins of the target even when the latter has lower power in that frequency band.
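
The two building blocks named in the abstract, the Ideal Binary Mask and the balanced draw of dominant T-F bins per frequency band, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ideal_binary_mask` and `balanced_bin_sample`, the toy power values, and the tie-breaking rule (a tie goes to source A) are all assumptions made for illustration, and no Random Forest is trained here.

```python
import random

def ideal_binary_mask(power_a, power_b):
    """Return mask[f][t] = 1 where source A has at least as much power as source B.

    Assumption: ties are assigned to source A; the abstract does not say how ties are handled.
    """
    return [[1 if a >= b else 0 for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(power_a, power_b)]

def balanced_bin_sample(mask_row, n_per_class, rng):
    """Draw equal numbers of time indices dominated by each source in one frequency band."""
    a_bins = [t for t, m in enumerate(mask_row) if m == 1]
    b_bins = [t for t, m in enumerate(mask_row) if m == 0]
    # Cap the draw at the size of the smaller class so both classes stay balanced.
    n = min(n_per_class, len(a_bins), len(b_bins))
    return rng.sample(a_bins, n), rng.sample(b_bins, n)

# Toy power spectrograms: 2 frequency bands x 4 time frames (made-up values).
power_a = [[4.0, 1.0, 3.0, 0.5], [0.2, 2.0, 0.1, 5.0]]
power_b = [[1.0, 2.0, 1.0, 2.5], [0.4, 1.0, 0.3, 1.0]]

ibm = ideal_binary_mask(power_a, power_b)
rng = random.Random(0)  # fixed seed so the draw is repeatable
a_idx, b_idx = balanced_bin_sample(ibm[0], 2, rng)
```

In a fuller pipeline, the sampled indices for each band would select the T-F neighborhoods of the mixture used as training features for that band's classifier, with the mask values as labels.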


Pages: 3344 - 3348

Page count: 5
