SINGAN: Singing Voice Conversion with Generative Adversarial Networks

Citations: 0
Authors
Sisman, Berrak [1, 2]
Vijayan, Karthika [1]
Dong, Minghui [2]
Li, Haizhou [1]
Affiliations
[1] Natl Univ Singapore, Singapore, Singapore
[2] ASTAR, Inst Infocomm Res, Singapore, Singapore
Funding
National Research Foundation, Singapore;
Keywords
Singing voice conversion; generative adversarial networks; singing voice;
DOI
10.1109/apsipaasc47483.2019.9023162
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
Singing voice conversion (SVC) is the task of converting a source singer's voice to sound like that of a target singer without changing the lyrical content. So far, most voice conversion studies have focused on speech voice conversion, which differs from singing voice conversion. We note that singing conveys both lexical and emotional information through words and tones. It is one of the most expressive components of music and a means of entertainment as well as self-expression. In this paper, we propose a novel singing voice conversion framework based on Generative Adversarial Networks (GANs). The proposed GAN-based conversion framework, which we call SINGAN, consists of two neural networks: a discriminator that distinguishes natural from converted singing voice, and a generator that tries to deceive the discriminator. With the GAN, we minimize the difference between the distributions of the original target parameters and the generated singing parameters. To the best of our knowledge, this is the first framework that uses generative adversarial networks for singing voice conversion. In experiments, we show that the proposed method effectively converts singing voices and outperforms the baseline approach.
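The adversarial setup described in the abstract can be illustrated with the standard GAN losses. This is a minimal sketch under stated assumptions, not the SINGAN objective itself: the record does not give the paper's loss functions, so the code uses the commonly used binary cross-entropy discriminator loss and the non-saturating generator loss. The function names and the example probabilities are hypothetical.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Binary cross-entropy loss for the discriminator (standard GAN form).

    d_real: discriminator outputs in (0, 1) for natural target features.
    d_fake: discriminator outputs in (0, 1) for converted (generated) features.
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: lower when the discriminator is fooled."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


# Hypothetical discriminator outputs, for illustration only.
d_real = [0.9, 0.8]  # natural singing frames scored as "real"
d_fake = [0.2, 0.1]  # converted singing frames scored as "real"

print("discriminator loss:", discriminator_loss(d_real, d_fake))
print("generator loss:", generator_loss(d_fake))
```

In this formulation the discriminator loss falls as it separates natural from converted features, while the generator loss falls as converted features are scored closer to natural ones. Training the two against each other is what pushes the generated distribution toward the target distribution.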
Pages: 112-118
Page count: 7
Related papers
50 in total
  • [21] Face Reconstruction from Voice using Generative Adversarial Networks
    Wen, Yandong
    Singh, Rita
    Raj, Bhiksha
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 32 (NIPS 2019), 2019, 32
  • [22] Active Defense Against Voice Conversion Through Generative Adversarial Network
    Dong, Shihang
    Chen, Beijing
    Ma, Kaijie
    Zhao, Guoying
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 706 - 710
  • [23] IMPROVING ADVERSARIAL WAVEFORM GENERATION BASED SINGING VOICE CONVERSION WITH HARMONIC SIGNALS
    Guo, Haohan
    Zhou, Zhiping
    Meng, Fanbo
    Liu, Kai
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6657 - 6661
  • [24] ASGAN-VC: One-Shot Voice Conversion with Additional Style Embedding and Generative Adversarial Networks
    Li, WeiCheng
    Wei, Tzer-jen
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1932 - 1937
  • [25] A Survey on Generative Adversarial Networks based Models for Many-to-many Non-parallel Voice Conversion
    Alaa, Yasmin
    Alfonse, Marco
    Aref, Mostafa M.
    5TH INTERNATIONAL CONFERENCE ON COMPUTING AND INFORMATICS (ICCI 2022), 2022, : 221 - 226
  • [26] Xiaoicesing 2: A High-Fidelity Singing Voice Synthesizer Based on Generative Adversarial Network
    Wang, Chunhui
    Zeng, Chang
    He, Xing
    INTERSPEECH 2023, 2023, : 5401 - 5405
  • [27] SVCGAN: Speaker Voice Conversion Generative Adversarial Network for Children's Speech Conversion and Recognition
    Xie, Chenghuan
    Zhou, Aimin
    JOURNAL OF ELECTRICAL SYSTEMS, 2024, 20 (03) : 2182 - 2196
  • [28] Unsupervised Singing Voice Conversion
    Nachmani, Eliya
    Wolf, Lior
    INTERSPEECH 2019, 2019, : 2583 - 2587
  • [29] Non-parallel Many-to-many Singing Voice Conversion by Adversarial Learning
    Hu, Jinsen
    Yu, Chunyan
    Guan, Faqian
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 125 - 132
  • [30] Voice Conversion from Tibetan Amdo Dialect to Tibetan U-tsang Dialect Based on Generative Adversarial Networks
    Gan Zhenye
    Zhao Guangying
    Yang Hongwu
    Xing Xiaotian
    Jiao Yi
    PROCEEDINGS OF 2019 IEEE 8TH JOINT INTERNATIONAL INFORMATION TECHNOLOGY AND ARTIFICIAL INTELLIGENCE CONFERENCE (ITAIC 2019), 2019, : 325 - 329