GAN-based Augmentation for Populating Speech Dataset with High Fidelity Synthesized Audio

Cited by: 0

Authors:

Back, Moon-Ki [1]

Yoon, Seung-Won [1]

Lee, Kyu-Chul [1]

Affiliations:

[1] Chungnam Natl Univ, Dept Comp Engn, Daejeon, South Korea

Source:

11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020) | 2020

Keywords:

Audio augmentation; generative adversarial networks; harmonic percussive separation; progressive growing; speech dataset;

DOI:

Not available

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic and Communication Technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

In this paper, we present an audio augmentation method that generates synthetic audio using Generative Adversarial Networks (GANs). We propose a training strategy that first uses Harmonic Percussive Source Separation (HPSS) to extract spectral features and then improves the fidelity of the synthesized audio by applying progressively growing GANs. We demonstrate the method on a public speech dataset released by Google TensorFlow. Evaluated by Fréchet Inception Distance (FID), our method scores 12.514, compared with 14.012 for existing image-generating GANs (lower FID indicates better fidelity).
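To make the FID comparison in the abstract concrete, the following is a minimal sketch of the Fréchet distance between two sets of feature vectors, each modeled as a Gaussian. It is not the authors' evaluation code. The function name `frechet_distance` and the random arrays are hypothetical stand-ins; in an actual FID computation the features would come from a pretrained Inception network applied to real and generated samples.

```python
import numpy as np


def frechet_distance(feats_real, feats_fake):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input is an (n_samples, n_features) array of embeddings.
    Hypothetical helper for illustration; not taken from the paper.
    """
    mu_r = feats_real.mean(axis=0)
    mu_f = feats_fake.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    cov_f = np.atleast_2d(np.cov(feats_fake, rowvar=False))

    # Tr((cov_r cov_f)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_f, which are real and non-negative when
    # both covariances are positive semi-definite. Clipping removes small
    # negative values caused by floating-point error.
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_f
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_f) - 2.0 * trace_sqrt)


# Stand-in features (assumption): random vectors instead of Inception embeddings.
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 4))
fake = real + 3.0  # same covariance, mean shifted by 3 in each of 4 dimensions

same_score = frechet_distance(real, real)
shifted_score = frechet_distance(real, fake)
```

Identical feature sets give a distance near zero, and a pure mean shift contributes only the squared distance between the means, which is why a lower score is read as closer agreement between generated and real data.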


Pages: 1267-1269

Page count: 3
