Monaural Source Separation Using a Random Forest Classifier

Cited by: 1
Authors
Riday, Cosimo [1]
Bhargava, Saurabh
Hahnloser, Richard H. R.
Liu, Shih-Chii
Affiliations
[1] Univ Zurich, Inst Neuroinformat, Zurich, Switzerland
Source
17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES | 2016
Funding
Swiss National Science Foundation
Keywords
monaural source separation; random forest; deep learning; CASA; IMPROVE SPEECH RECOGNITION; NOISE;
DOI
10.21437/Interspeech.2016-252
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
We address the problem of separating two audio sources from a single-channel mixture recording. A novel method called Multi Layered Random Forest (MLRF) that learns a binary mask for both sources is presented. Random Forest (RF) classifiers are trained for each frequency band of a source spectrogram. A specialized set of linear transformations is applied to a local time-frequency (T-F) neighborhood of the mixture to capture relevant local statistics. A sampling method is presented that efficiently samples T-F training bins in each frequency band. We draw equal numbers of dominant (more power) training samples from the two sources for RF classifiers that estimate the Ideal Binary Mask (IBM). An estimated IBM in a given layer is used to train an RF classifier in the next higher layer of the MLRF hierarchy. On average, MLRF performs better than deep Recurrent Neural Networks (RNNs) and Non-Negative Sparse Coding (NNSC) in signal-to-noise ratio (SNR) of reconstructed audio, overall T-F bin classification accuracy, as well as PESQ and STOI scores. Additionally, we demonstrate the ability of the MLRF to correctly reconstruct T-F bins of the target even when the latter has lower power in that frequency band.
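To make the terms in the abstract concrete, the following is a minimal sketch of an ideal binary mask and of drawing equal numbers of dominant T-F bins from each source within one frequency band. It is not the authors' implementation: the abstract does not give the exact mask definition or sampling procedure, so the "more power wins" rule with no extra threshold, the toy random spectrograms, and the function names `ideal_binary_mask` and `balanced_bin_sample` are all assumptions made for illustration.

```python
import numpy as np

def ideal_binary_mask(power_a, power_b):
    """Assumed IBM rule: 1 where source A has more power than source B, else 0."""
    return (power_a > power_b).astype(np.uint8)

def balanced_bin_sample(mask_row, n_per_class, rng):
    """Draw equal numbers of A-dominant and B-dominant frame indices
    from a single frequency band (one row of the mask)."""
    idx_a = np.flatnonzero(mask_row == 1)  # frames where A dominates
    idx_b = np.flatnonzero(mask_row == 0)  # frames where B dominates
    # Cap the draw at the size of the smaller class so both draws match.
    n = min(n_per_class, idx_a.size, idx_b.size)
    pick_a = rng.choice(idx_a, size=n, replace=False)
    pick_b = rng.choice(idx_b, size=n, replace=False)
    return pick_a, pick_b

rng = np.random.default_rng(0)
# Toy power spectrograms (frequency bands x time frames); hypothetical data.
power_a = rng.random((4, 50))
power_b = rng.random((4, 50))

ibm = ideal_binary_mask(power_a, power_b)
pick_a, pick_b = balanced_bin_sample(ibm[0], 10, rng)
```

In a pipeline of the kind the abstract describes, the selected frame indices would pick out the training bins for the classifier of that frequency band; feature extraction and the layered classifiers are left out of this sketch.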
Pages: 3344-3348
Number of pages: 5