High-Individuality Voice Conversion Based on Concatenative Speech Synthesis

被引:0
|
作者
Fujii, Kei [1 ]
Okawa, Jun [1 ]
Suigetsu, Kaori [1 ]
机构
[1] Kumamoto Natl Coll Technol, Dept Informat & Comp Sci, Kohshi City, Kumamoto 8611102, Japan
来源
PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 26, PARTS 1 AND 2, DECEMBER 2007 | 2007年 / 26卷
关键词
concatenative speech synthesis; join cost; speaker individuality; unit selection; voice conversion;
D O I
暂无
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Concatenative speech synthesis is a method that can make speech sound which has naturalness and high-individuality of a speaker by introducing a large speech corpus. Based on this method, in this paper, we propose a voice conversion method whose conversion speech has high-individuality and naturalness. The authors also have two subjective evaluation experiments for evaluating individuality and sound quality of conversion speech. From the results, following three facts have be confirmed: (a) the proposal method can convert the individuality of speakers well, (b) employing the framework of unit selection (especially join cost) of concatenative speech synthesis into conventional voice conversion improves the sound quality of conversion speech, and (c) the proposal method is robust against the difference of genders between a source speaker and a target speaker.
引用
收藏
页码:483 / 488
页数:6
相关论文
共 50 条
  • [1] Syllable Based Concatenative Synthesis for Text to Speech Conversion
    Ananthi, S.
    Dhanalakshmi, P.
    COMPUTATIONAL INTELLIGENCE IN DATA MINING, VOL 3, 2015, 33
  • [2] CUTE: A CONCATENATIVE METHOD FOR VOICE CONVERSION USING EXEMPLAR-BASED UNIT SELECTION
    Jin, Zeyu
    Finkelstein, Adam
    DiVerdi, Stephen
    Lu, Jingwan
    Mysore, Gautham J.
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5660 - 5664
  • [3] Spectral dynamics as a source of discontinuity in concatenative speech synthesis
    Kirkpatrick, Barry
    O'Brien, Darragh
    Scaife, Ronan
    Errity, Andrew
    PROCEEDINGS OF THE 2007 15TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING, 2007, : 615 - +
  • [4] LSM-based unit pruning for concatenative speech synthesis
    Bellegarda, Jerome R.
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 521 - 524
  • [5] Discriminative training for concatenative speech synthesis
    Kim, NS
    Park, SS
    IEEE SIGNAL PROCESSING LETTERS, 2004, 11 (01) : 40 - 43
  • [6] IMPROVING VOICE QUALITY OF HMM-BASED SPEECH SYNTHESIS USING VOICE CONVERSION METHOD
    Jiao, Yishan
    Xie, Xiang
    Na, Xingyu
    Tu, Ming
    2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2014,
  • [7] Feature Extraction for Spectral Continuity Measures in Concatenative Speech Synthesis
    Kirkpatrick, Barry
    O'Brien, Darragh
    Scaife, Ronan
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1742 - 1745
  • [8] A Comparison of Voice Conversion Methods for Transforming Voice Quality in Emotional Speech Synthesis
    Tuerk, Oytun
    Schroeder, Marc
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 2282 - 2285
  • [9] Control of spectral dynamics in concatenative speech synthesis
    Wouters, J
    Macon, MW
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (01): : 30 - 38
  • [10] Speech Analysis/Synthesis by Gaussian Mixture Approximation of the Speech Spectrum for Voice Conversion
    Amini, Jamal
    Shahrebabaki, Abdoreza Sabzi
    Shokouhi, Navid
    Sheikhzadeh, Hamid
    Raahemifa, Kaamran
    Eslami, Mehdi
    2013 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (IEEE ISSPIT 2013), 2013, : 428 - 433