Voice Conversion without Parallel Speech Corpus Based on Mixtures of Linear Transform

Cited by: 1
Authors
Jian, Zhi-Hua [1]
Yang, Zhen [1]
Affiliations
[1] Nanjing Univ Post & Telecommun, Inst Signal Proc & Transmiss, Nanjing, Peoples R China
Source
2007 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, NETWORKING AND MOBILE COMPUTING, VOLS 1-15 | 2007
Keywords
Voice conversion; multimedia application; Ms-LT; EM algorithm
DOI
10.1109/WICOM.2007.701
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
This paper presents a voice conversion algorithm based on mixtures of linear transforms (Ms-LT) that avoids the need for the parallel training data required by conventional approaches. Within a maximum likelihood framework, the EM algorithm is used to estimate the parameters of the conversion function. The chirp z-transform is then used to enhance the spectral envelope, which is averaged out by the linear weighting. The proposed voice conversion system is evaluated using both objective and subjective measures. The experimental results demonstrate that the approach effectively transforms speaker identity and achieves results comparable to those of conventional methods that rely on a parallel corpus.
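The abstract does not state the form of the conversion function. As a rough illustration only, the sketch below assumes the form commonly used in mixture-based voice conversion: each source frame is mapped by a posterior-weighted sum of per-component linear transforms, F(x) = sum_i p(i|x) (A_i x + b_i). The diagonal-covariance mixture, the function names, and all parameter names are assumptions made for this example, not details taken from the paper. The EM estimation of the parameters and the chirp z-transform enhancement step are not shown.

```python
import numpy as np

def posteriors(x, weights, means, variances):
    """Posterior p(i|x) of each mixture component for one frame x.

    Assumes a diagonal-covariance Gaussian mixture (an assumption of this sketch).
    weights: (M,), means: (M, D), variances: (M, D), x: (D,)
    """
    # Log-density of x under every component
    log_lik = -0.5 * (np.sum(np.log(2.0 * np.pi * variances), axis=1)
                      + np.sum((x - means) ** 2 / variances, axis=1))
    log_post = np.log(weights) + log_lik
    log_post -= log_post.max()  # stabilise before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

def convert_frame(x, weights, means, variances, A, b):
    """Hypothetical conversion: sum_i p(i|x) * (A_i x + b_i).

    A: (M, D, D) per-component matrices, b: (M, D) per-component biases.
    """
    p = posteriors(x, weights, means, variances)
    per_component = A @ x + b  # (M, D): every component's linear transform of x
    return p @ per_component   # (D,): posterior-weighted combination
```

Because the output is a weighted average over components, sharp spectral detail tends to be smoothed, which is the averaging effect the abstract attributes to linear weighting.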
Pages: 2825-2828
Page count: 4