VOICE CONVERSION USING ARTIFICIAL NEURAL NETWORKS

Cited by: 140

Authors:

Desai, Srinivas [1]

Raghavendra, E. Veera [1]

Yegnanarayana, B. [1]

Black, Alan W. [2]

Prahallad, Kishore [1,2]

Affiliations:

[1] Int Inst Informat Technol, Hyderabad, Andhra Pradesh, India

[2] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA

Source:

2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS | 2009

Keywords:

Voice conversion; Artificial Neural Networks; Gaussian Mixture Model

DOI:

10.1109/ICASSP.2009.4960478

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

In this paper, we propose to use Artificial Neural Networks (ANN) for voice conversion. We have exploited the mapping abilities of ANN to perform mapping of spectral features of a source speaker to that of a target speaker. A comparative study of voice conversion using ANN and the state-of-the-art Gaussian Mixture Model (GMM) is conducted. The results of voice conversion evaluated using subjective and objective measures confirm that ANNs perform better transformation than GMMs and the quality of the transformed speech is intelligible and has the characteristics of the target speaker.


Pages: 3893 / +

Page count: 2
