A Survey on Generative Adversarial Networks based Models for Many-to-many Non-parallel Voice Conversion

Cited by: 3
Authors
Alaa, Yasmin [1]
Alfonse, Marco [1]
Aref, Mostafa M. [1]
Affiliations
[1] Ain Shams Univ, Dept Comp Sci, Fac Comp & Informat Sci, Cairo, Egypt
Keywords
Voice Conversion; many-to-many Voice Conversion; non-parallel Voice Conversion; mono-lingual Voice Conversion; Generative Adversarial Networks (GANs); StarGAN-VC; CycleGAN-VC; RECOGNITION;
DOI
10.1109/ICCI54321.2022.9756059
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Voice Conversion (VC) is the task of converting the speaker-dependent features of a source speaker's speech without changing its linguistic content. Many successful VC systems exist, each addressing particular challenges. These challenges include the unavailability of parallel data and problems caused by language differences between the source and target speech. A further challenge is extending a VC system to cover conversion across many source and target domains at minimal cost. Generative Adversarial Networks (GANs) have shown promising VC results. This work surveys many-to-many, non-parallel, GAN-based mono-lingual VC models (nine highly cited models), explains the evaluation methods used, both objective and subjective (eight evaluation methods are presented), and comments on these models.
Pages: 221-226
Page count: 6