Speech Enhancement Using Dynamical Variational AutoEncoder

Cited by: 0

Authors:

Do, Hao D. [1]

Affiliation:

[1] FPT Univ, Ho Chi Minh City, Vietnam

Source:

INTELLIGENT INFORMATION AND DATABASE SYSTEMS, ACIIDS 2023, PT II | 2023, Vol. 13996

Keywords:

speech enhancement; dynamical variational autoEncoder; generative model

DOI:

10.1007/978-981-99-5837-5_21

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This research addresses speech enhancement with a generative model. Many existing solutions are trained on a fixed set of interference or noise types and struggle to extract speech from a mixture containing an unfamiliar noise. We use a class of generative models called Dynamical Variational AutoEncoder (DVAE), which combines generative and temporal models to analyze the speech signal. Models in this class attend to the behavior of the speech signal, then extract and enhance the speech. Moreover, we design a new architecture in the DVAE class named Bi-RVAE, which is simpler than the other models yet achieves good results. Experimental results show that the DVAE class, including our proposed design, recovers high-quality speech. This class could be used to enhance the speech signal before passing it to the central processing models.

Pages: 247-258

Page count: 12
