GAN-based Augmentation for Populating Speech Dataset with High Fidelity Synthesized Audio

Times cited: 0
Authors
Back, Moon-Ki [1]
Yoon, Seung-Won [1]
Lee, Kyu-Chul [1]
Affiliations
[1] Chungnam Natl Univ, Dept Comp Engn, Daejeon, South Korea
Source
11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020) | 2020
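Keywords
Audio augmentation; generative adversarial networks; harmonic percussive separation; progressive growing; speech dataset
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract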
In this paper, we present an audio augmentation method that generates synthetic audio using Generative Adversarial Networks (GANs). We propose a training strategy that first uses Harmonic Percussive Source Separation (HPSS) to extract spectral features and then improves the fidelity of the synthesized audio by applying progressively growing GANs. We demonstrate the method on a public speech dataset released by Google TensorFlow. Our method achieved a Fréchet Inception Distance (FID) of 12.514, compared with 14.012 for existing image-generating GANs (lower FID indicates better fidelity).
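The two building blocks named in the abstract, HPSS and FID, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the record does not say which HPSS variant or feature extractor was used, so the sketch assumes the common median-filtering form of HPSS on a magnitude spectrogram, and it assumes the feature vectors passed to the FID function have already been extracted by some embedding network. The function names and the default kernel size of 17 are hypothetical choices for illustration.

```python
import numpy as np


def median_filter_1d(x, size, axis):
    """Sliding median along one axis with edge padding (size should be odd)."""
    pad = size // 2
    pad_width = [(0, 0)] * x.ndim
    pad_width[axis] = (pad, pad)
    padded = np.pad(x, pad_width, mode="edge")
    # Adds a trailing axis of length `size` holding each window.
    windows = np.lib.stride_tricks.sliding_window_view(padded, size, axis=axis)
    return np.median(windows, axis=-1)


def hpss_soft_masks(magnitude, kernel=17, eps=1e-10):
    """Split a (freq, time) magnitude spectrogram into harmonic and percussive parts.

    Assumption: median-filtering HPSS with soft (Wiener-like) masks.
    """
    # Harmonic energy forms horizontal ridges, so smooth along time.
    harmonic = median_filter_1d(magnitude, kernel, axis=1)
    # Percussive energy forms vertical ridges, so smooth along frequency.
    percussive = median_filter_1d(magnitude, kernel, axis=0)
    total = harmonic ** 2 + percussive ** 2 + eps
    return magnitude * harmonic ** 2 / total, magnitude * percussive ** 2 / total


def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two (n_samples, dim) feature sets.

    Assumption: features are precomputed; FID proper uses Inception embeddings.
    """
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # Tr((cov_a cov_b)^(1/2)) is the sum of square roots of the eigenvalues
    # of cov_a @ cov_b, which are real and non-negative for PSD inputs.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)
```

In this sketch, a lower `frechet_distance` means the two feature distributions are closer, which matches the abstract's reading that the lower score (12.514 versus 14.012) indicates better fidelity.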
Pages: 1267-1269
Page count: 3