DNAE-GAN: Noise-free acoustic signal generator by integrating autoencoder and generative adversarial network

Cited by: 10
Authors
Kuo, Ping-Huan [1]
Lin, Ssu-Ting [1]
Hu, Jun [1]
Affiliations
[1] Natl Pingtung Univ, Comp & Intelligent Robot Program Bachelor Degree, Pingtung 90004, Taiwan
Source
INTERNATIONAL JOURNAL OF DISTRIBUTED SENSOR NETWORKS | 2020 / Vol. 16 / No. 05
Keywords
Generative adversarial network; autoencoder; acoustic signal generator; deep learning; machine learning;
DOI
10.1177/1550147720923529
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Linear predictive coding is an extremely effective voice generation method that operates through a simple process. However, voices generated by linear predictive coding have limited variation and exhibit excessive noise. To resolve these problems, this article proposes an artificial intelligence model that combines a denoise autoencoder with generative adversarial networks. The model generates voices with similar semantics from random input drawn from the latent space of the generator. The experimental results indicate that voices generated exclusively by generative adversarial networks exhibit excessive noise. To solve this problem, a denoise autoencoder was connected to the generator for denoising. The experimental results demonstrate the feasibility of the proposed voice generation method. In the future, this method can be applied in robots and voice generation applications to improve the expressive language ability of robots and enable them to speak in a more human-like and natural manner.
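The data flow described in the abstract (random latent input, generator, then a denoise autoencoder attached to the generator output) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the layer sizes, the single-layer `tanh` networks, the random untrained weights, and the names `generator`, `denoising_autoencoder`, and `generate_frame` are hypothetical and are not taken from the paper. The sketch shows only how the stages connect, not the authors' architecture or training procedure.

```python
import math
import random

random.seed(0)

# Hypothetical sizes chosen for illustration; the paper's values are not given here.
LATENT_DIM = 8    # length of the random latent vector fed to the generator
SIGNAL_LEN = 64   # number of samples in one generated signal frame
CODE_DIM = 16     # bottleneck width of the autoencoder


def make_layer(in_dim, out_dim):
    """Create random, untrained weights: one row of in_dim weights per output unit."""
    return [[random.gauss(0.0, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_layer(weights, x):
    """Fully connected layer with tanh activation, so outputs lie in [-1, 1]."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


GEN = make_layer(LATENT_DIM, SIGNAL_LEN)  # stand-in for a trained GAN generator
ENC = make_layer(SIGNAL_LEN, CODE_DIM)    # autoencoder encoder
DEC = make_layer(CODE_DIM, SIGNAL_LEN)    # autoencoder decoder


def generator(z):
    """Map a latent vector to a (possibly noisy) signal frame."""
    return apply_layer(GEN, z)


def denoising_autoencoder(x):
    """Compress the frame to a bottleneck code and reconstruct it."""
    return apply_layer(DEC, apply_layer(ENC, x))


def generate_frame():
    """Sample a latent vector, generate a frame, then pass it through the autoencoder."""
    z = [random.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
    return denoising_autoencoder(generator(z))


frame = generate_frame()
print(len(frame))
```

With trained weights, the generator stage would come from adversarial training and the autoencoder would be trained to map noisy signals to clean ones; the untrained weights here only demonstrate the chaining of the two stages.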
Pages: 13