Speech Analysis/Synthesis by Gaussian Mixture Approximation of the Speech Spectrum for Voice Conversion

Cited: 0
Authors
Amini, Jamal [1 ]
Shahrebabaki, Abdoreza Sabzi [1 ]
Shokouhi, Navid [1 ]
Sheikhzadeh, Hamid [1 ]
Raahemifar, Kaamran [2 ]
Eslami, Mehdi [1 ]
Affiliations
[1] Amirkabir Univ Technol, Dept Elect Engn, Tehran, Iran
[2] Ryerson Univ, Dept Elect & Comp Engn, Toronto, ON M5B 2K3, Canada
Source
2013 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (IEEE ISSPIT 2013), 2013
Keywords
Analysis/Synthesis; Feature Extraction; Voice Conversion; GMM; STRAIGHT; FREQUENCY
DOI
Not available
Chinese Library Classification
TP3 [Computing Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Voice conversion typically employs spectral features to convert a source voice to a target voice. In this paper, we propose a simple method of fitting the STRAIGHT spectrum with Gaussian mixture (GM) models for speech analysis/synthesis and spectral modification. The mean values of the Gaussians are pre-determined based on Mel-frequency spacing. The standard deviations are adaptively adjusted using the constant-Q principle and the spectrum amplitudes. Finally, the weights of the Gaussians are determined by sampling the log-spectrum at Mel frequencies. The proposed analysis/synthesis method (MFLS-GM) is employed for speech analysis/synthesis and voice conversion. Subjective evaluations employing MOS and ABX demonstrate superior performance of voice conversion using the MFLS-GM compared to systems employing MFCC features. The computation cost of the proposed analysis/synthesis method is also much lower than that of MFCC-based methods.
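The fitting procedure the abstract describes (Mel-spaced Gaussian means, constant-Q standard deviations, weights obtained by sampling the log-spectrum at the Mel frequencies) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the paper additionally adapts the standard deviations to the spectrum amplitudes, which is omitted here, and the number of Gaussians, the Q factor, and the sigma floor are illustrative choices.

```python
import numpy as np

def hz_to_mel(f):
    # O'Shaughnessy Mel-scale formula
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def gm_fit_spectrum(log_spec, freqs, n_gauss=40, q=4.0):
    """Approximate a log-spectrum by a sum of Gaussians.

    Means: Mel-spaced centre frequencies over the analysis band.
    Stds: constant-Q, i.e. proportional to the centre frequency
          (amplitude-adaptive adjustment from the paper is omitted).
    Weights: the log-spectrum sampled at the Mel-spaced frequencies.
    """
    # Mel-spaced Gaussian means
    mel_grid = np.linspace(hz_to_mel(freqs[0]), hz_to_mel(freqs[-1]), n_gauss)
    mu = mel_to_hz(mel_grid)
    # constant-Q standard deviations, sigma_k = mu_k / Q;
    # the 50 Hz floor keeps low-frequency Gaussians wider than the bin spacing
    sigma = np.maximum(mu / q, 50.0)
    # weights = log-spectrum sampled at the Mel frequencies
    w = np.interp(mu, freqs, log_spec)
    return mu, sigma, w

def gm_synthesize(mu, sigma, w, freqs):
    """Reconstruct the log-spectrum as a weighted sum of Gaussians."""
    g = np.exp(-0.5 * ((freqs[:, None] - mu[None, :]) / sigma[None, :]) ** 2)
    # normalise per frequency bin so the mixture interpolates the weights
    g /= g.sum(axis=1, keepdims=True) + 1e-12
    return g @ w
```

In a voice-conversion setting the per-frame parameters (mu, sigma, w) would then serve as the spectral feature vector to be mapped from source to target before resynthesis.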
Pages: 428-433
Page count: 6
Related Papers
50 records
  • [1] Esophageal Speech Enhancement Based on Statistical Voice Conversion with Gaussian Mixture Models
    Doi, Hironori
    Nakamura, Keigo
    Toda, Tomoki
    Saruwatari, Hiroshi
    Shikano, Kiyohiro
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2010, E93D (09): : 2472 - 2482
  • [2] Age Approximation from Speech using Gaussian Mixture Models
    Mittal, Tanushri
    Barthwal, Anurag
    Koolagudi, Shashidhar G.
    2013 SECOND INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING, NETWORKING AND SECURITY (ADCONS 2013), 2013, : 74 - 78
  • [3] A Comparison of Voice Conversion Methods for Transforming Voice Quality in Emotional Speech Synthesis
    Tuerk, Oytun
    Schroeder, Marc
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 2282 - 2285
  • [4] HMM adaptation and voice conversion for the synthesis of child speech: a comparison
    Watts, Oliver
    Yamagishi, Junichi
    King, Simon
    Berkling, Kay
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2595 - +
  • [5] Quantized F0 Context and Its Applications to Speech Synthesis, Speech Coding and Voice Conversion
    Nose, Takashi
    Kobayashi, Takao
    2014 TENTH INTERNATIONAL CONFERENCE ON INTELLIGENT INFORMATION HIDING AND MULTIMEDIA SIGNAL PROCESSING (IIH-MSP 2014), 2014, : 578 - 581
  • [6] Speech Enhancement for Automatic Speech Recognition Using Complex Gaussian Mixture Priors for Noise and Speech
    Astudillo, Ramon F.
    Hoffmann, Eugen
    Mandelartz, Philipp
    Orglmeister, Reinhold
    ADVANCES IN NONLINEAR SPEECH PROCESSING, 2010, 5933 : 60 - 67
  • [7] High-Individuality Voice Conversion Based on Concatenative Speech Synthesis
    Fujii, Kei
    Okawa, Jun
    Suigetsu, Kaori
    PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 26, PARTS 1 AND 2, DECEMBER 2007, 2007, 26 : 483 - 488
  • [8] Evaluation of Expressive Speech Synthesis With Voice Conversion and Copy Resynthesis Techniques
    Turk, Oytun
    Schroeder, Marc
    IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2010, 18 (05): : 965 - 973
  • [9] Iteratively Improving Speech Recognition and Voice Conversion
    Singh, Mayank Kumar
    Takahashi, Naoya
    Onoe, Naoyuki
    INTERSPEECH 2023, 2023, : 206 - 210
  • [10] Automatic Speech Disentanglement for Voice Conversion using Rank Module and Speech Augmentation
    Liu, Zhonghua
    Wang, Shijun
    Chen, Ning
    INTERSPEECH 2023, 2023, : 2298 - 2302