GAN-based Augmentation for Populating Speech Dataset with High Fidelity Synthesized Audio

Citations: 0
Authors
Back, Moon-Ki [1]
Yoon, Seung-Won [1]
Lee, Kyu-Chul [1]
Affiliations
[1] Chungnam Natl Univ, Dept Comp Engn, Daejeon, South Korea
Source
11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020) | 2020
Keywords
Audio augmentation; generative adversarial networks; harmonic percussive separation; progressive growing; speech dataset
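DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract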
In this paper, we present an audio augmentation method that generates synthetic audio using Generative Adversarial Networks (GANs). We propose a training strategy that first uses Harmonic Percussive Source Separation (HPSS) to extract spectral features and then improves the fidelity of the synthesized audio by applying progressively growing GANs. We demonstrate the method on a public speech dataset released by Google TensorFlow. Measured by Fréchet Inception Distance (FID), our method scored 12.514, compared with 14.012 for existing image-generating GANs (lower FID indicates better fidelity).
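The FID figures quoted in the abstract compare the statistics of two feature sets. As a rough illustration only, the sketch below computes the Fréchet distance between Gaussians fitted to two feature matrices using NumPy. The function name `frechet_distance` and the random placeholder features are assumptions made for this example; the paper's actual features would come from an Inception network, and its evaluation pipeline is not described in this record.

```python
import numpy as np


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input is an array of shape (n_samples, n_features) with
    n_features >= 2. Hypothetical helper, not the authors' code.
    """
    mu_a = feats_a.mean(axis=0)
    mu_b = feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)

    # Tr(sqrt(cov_a @ cov_b)) equals the sum of the square roots of the
    # eigenvalues of the product, which are real and non-negative for
    # positive semi-definite covariances (clipped to absorb rounding noise).
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    mean_term = np.sum((mu_a - mu_b) ** 2)
    return float(mean_term + np.trace(cov_a) + np.trace(cov_b) - 2.0 * tr_sqrt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(500, 4))          # placeholder "real" features
    fake = rng.normal(loc=0.5, size=(500, 4)) # placeholder "synthetic" features
    print(frechet_distance(real, fake))
```

A lower value means the two feature distributions are closer, which is why the abstract reads the smaller FID as better fidelity.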
Pages: 1267-1269
Number of pages: 3