High-Individuality Voice Conversion Based on Concatenative Speech Synthesis

Times cited: 0

Authors:

Fujii, Kei [1]

Okawa, Jun [1]

Suigetsu, Kaori [1]

Affiliation:

[1] Kumamoto Natl Coll Technol, Dept Informat & Comp Sci, Kohshi City, Kumamoto 8611102, Japan

Source:

PROCEEDINGS OF WORLD ACADEMY OF SCIENCE, ENGINEERING AND TECHNOLOGY, VOL 26, PARTS 1 AND 2, DECEMBER 2007 | 2007, Vol. 26

Keywords:

concatenative speech synthesis; join cost; speaker individuality; unit selection; voice conversion

DOI:

Not available

Chinese Library Classification number:

TP301 [Theory and methods]

Subject classification code:

081202

Abstract:

Concatenative speech synthesis can produce natural-sounding speech that preserves a speaker's individuality by drawing on a large speech corpus. Building on this method, we propose a voice conversion method whose converted speech has both high individuality and naturalness. We also conducted two subjective evaluation experiments to assess the individuality and sound quality of the converted speech. The results confirm three facts: (a) the proposed method converts speaker individuality well, (b) incorporating the unit selection framework of concatenative speech synthesis (especially the join cost) into conventional voice conversion improves the sound quality of the converted speech, and (c) the proposed method is robust to gender differences between the source speaker and the target speaker.


Pages: 483-488

Page count: 6
