Speech Enhancement Using Dynamical Variational AutoEncoder

Cited by: 0
Authors
Do, Hao D. [1]
Affiliations
[1] FPT Univ, Ho Chi Minh City, Vietnam
Source
INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2023, PT II | 2023 / Vol. 13996
Keywords
speech enhancement; dynamical variational autoEncoder; generative model
DOI
10.1007/978-981-99-5837-5_21
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This research addresses speech enhancement with a generative model. Many existing solutions are trained on a fixed set of interference or noise types and struggle to extract speech from a mixture that contains an unfamiliar noise. We use a class of generative models called the Dynamical Variational AutoEncoder (DVAE), which combines generative and temporal models to analyze the speech signal. Models in this class pay attention to the behavior of the speech signal, then extract and enhance the speech. Moreover, we design a new architecture in the DVAE class, named Bi-RVAE, which is simpler than the other models yet achieves good results. Experimental results show that the DVAE class, including our proposed design, produces high-quality recovered speech. This class could be used to enhance the speech signal before passing it to the central processing models.
Pages: 247-258
Page count: 12