Speech Enhancement Using Dynamical Variational AutoEncoder

Cited by: 0
Authors
Do, Hao D. [1]
Affiliation
[1] FPT Univ, Ho Chi Minh City, Vietnam
Source
INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2023, PT II | 2023, Vol. 13996
Keywords
speech enhancement; dynamical variational autoencoder; generative model
DOI
10.1007/978-981-99-5837-5_21
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This research addresses speech enhancement with a generative model. Many existing solutions are trained on a fixed set of interference or noise types and struggle to extract speech from a mixture containing unfamiliar noise. We use a class of generative models called the Dynamical Variational AutoEncoder (DVAE), which combines generative and temporal modeling to analyze the speech signal. Models in this class pay attention to the behavior of the speech signal, then extract and enhance the speech. Moreover, we design a new architecture in the DVAE class, named Bi-RVAE, which is simpler than the other models yet achieves good results. Experimental results show that the DVAE class, including our proposed design, produces high-quality recovered speech. This class could be used to enhance the speech signal before passing it to the central processing models.
Pages: 247-258
Page count: 12