VOICE CONVERSION USING ARTIFICIAL NEURAL NETWORKS

Cited by: 140
Authors
Desai, Srinivas [1]
Raghavendra, E. Veera [1]
Yegnanarayana, B. [1]
Black, Alan W. [2]
Prahallad, Kishore [1, 2]
Affiliations
[1] Int Inst Informat Technol, Hyderabad, Andhra Pradesh, India
[2] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Source
2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS | 2009
Keywords
Voice conversion; Artificial Neural Networks; Gaussian Mixture Model;
DOI
10.1109/ICASSP.2009.4960478
Chinese Library Classification number
O42 [Acoustics];
Subject classification codes
070206 ; 082403 ;
Abstract
In this paper, we propose to use Artificial Neural Networks (ANN) for voice conversion. We exploit the mapping abilities of ANNs to map the spectral features of a source speaker to those of a target speaker. A comparative study of voice conversion using ANN and the state-of-the-art Gaussian Mixture Model (GMM) is conducted. The results of voice conversion, evaluated using subjective and objective measures, confirm that ANNs perform better transformation than GMMs, and that the transformed speech is intelligible and has the characteristics of the target speaker.
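The spectral mapping described in the abstract can be illustrated with a small feed-forward network. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the record gives no architecture, feature type, or training details, so the one-hidden-layer network, the 4-dimensional synthetic "spectral" frames, the assumption that source and target frames are already time-aligned, and the names `train_mapping` and `convert` are all hypothetical.

```python
import numpy as np


def train_mapping(src, tgt, hidden=16, lr=0.1, epochs=500, seed=0):
    """Fit a one-hidden-layer network so that convert(params, src) approximates tgt.

    src, tgt: arrays of shape (n_frames, dim), assumed to be time-aligned pairs.
    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    d_in, d_out = src.shape[1], tgt.shape[1]
    W1 = rng.normal(0.0, 0.1, (d_in, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d_out))
    b2 = np.zeros(d_out)
    losses = []
    for _ in range(epochs):
        # Forward pass: tanh hidden layer, linear output layer.
        h = np.tanh(src @ W1 + b1)
        pred = h @ W2 + b2
        err = pred - tgt
        losses.append(float(np.mean(err ** 2)))
        # Backward pass for the mean squared error over all elements.
        g_pred = 2.0 * err / err.size
        g_W2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_z = (g_pred @ W2.T) * (1.0 - h ** 2)
        g_W1 = src.T @ g_z
        g_b1 = g_z.sum(axis=0)
        # Full-batch gradient descent step.
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return (W1, b1, W2, b2), losses


def convert(params, src):
    """Map source-speaker frames to predicted target-speaker frames."""
    W1, b1, W2, b2 = params
    return np.tanh(src @ W1 + b1) @ W2 + b2


# Synthetic stand-in data: random "source" frames and a nonlinear "target" mapping.
data_rng = np.random.default_rng(1)
src = data_rng.normal(size=(200, 4))
tgt = np.tanh(src @ data_rng.normal(size=(4, 4))) + 0.5

params, losses = train_mapping(src, tgt)
converted = convert(params, src)
```

In a real system the rows of `src` and `tgt` would be spectral feature vectors extracted from parallel utterances of the two speakers rather than random numbers, and the converted features would then be passed to a synthesis stage; neither step is shown here.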
Pages: 3893 / +
Page count: 2