Many-to-many Voice Conversion Based on Multiple Non-negative Matrix Factorization

Cited by: 0
Authors
Aihara, Ryo [1]
Takiguchi, Tetsuya [1]
Ariki, Yasuo [1]
Affiliations
[1] Kobe Univ, Grad Sch Syst Informat, Nada Ku, 1-1 Rokkodai, Kobe, Hyogo, Japan
Source
16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5 | 2015
Keywords
voice conversion; speech synthesis; many-to-many; exemplar-based; NMF;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
We present in this paper an exemplar-based Voice Conversion (VC) method using Non-negative Matrix Factorization (NMF), which differs from conventional statistical VC. NMF-based VC offers advantages in noise robustness and in the naturalness of the converted voice over Gaussian Mixture Model (GMM)-based VC. However, because NMF-based VC relies on parallel training data from the source and target speakers, it cannot convert the voice of arbitrary speakers. In this paper, we propose a many-to-many VC method that makes use of Multiple Non-negative Matrix Factorization (Multi-NMF). By using Multi-NMF, an arbitrary speaker's voice is converted to another arbitrary speaker's voice without the need for any input or output speaker training data. We consider this method flexible because it can be adapted to voice quality control or noise-robust VC.
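For orientation, the conventional parallel-dictionary NMF-based VC that the abstract contrasts with can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation and not the proposed Multi-NMF: it assumes a KL-divergence objective with multiplicative updates and an optional L1 sparsity weight (a common choice for exemplar-based NMF), and the function names `matmul`, `estimate_activations`, and `convert`, as well as the toy dictionaries, are hypothetical.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def estimate_activations(X, D, n_iter=100, sparsity=0.0, eps=1e-12, seed=0):
    """Estimate non-negative activations H such that X ~ D H, with D held fixed.

    X: features x frames, D: features x atoms, H: atoms x frames.
    Assumption: KL-divergence multiplicative updates with an L1 weight
    `sparsity`; every dictionary atom must have a positive column sum.
    """
    rng = random.Random(seed)
    n_atoms, n_frames = len(D[0]), len(X[0])
    H = [[rng.random() + 0.1 for _ in range(n_frames)] for _ in range(n_atoms)]
    Dt = [list(col) for col in zip(*D)]               # atoms x features
    denom = [sum(atom) + sparsity for atom in Dt]     # one value per atom
    for _ in range(n_iter):
        DH = matmul(D, H)
        ratio = [[x / (v + eps) for x, v in zip(xr, vr)] for xr, vr in zip(X, DH)]
        num = matmul(Dt, ratio)
        H = [[h * n / d for h, n in zip(hr, nr)] for hr, nr, d in zip(H, num, denom)]
    return H


def convert(X_src, D_src, D_tgt, **kwargs):
    """Decompose source spectra on the source dictionary, then rebuild them
    with the paired (parallel) target dictionary using the same activations."""
    H = estimate_activations(X_src, D_src, **kwargs)
    return matmul(D_tgt, H), H


# Toy data (hypothetical): 3 spectral bins, 2 paired exemplars, 2 frames.
D_src = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
D_tgt = [[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
X_src = [[2.0, 0.5], [1.0, 3.0], [1.0, 3.0]]   # built from activations [[2, 0.5], [1, 3]]
Y_tgt, H = convert(X_src, D_src, D_tgt)
```

The sketch makes the limitation noted in the abstract concrete: the atoms of `D_src` and `D_tgt` must be aligned pairs taken from parallel utterances of the two speakers, which is the requirement the many-to-many method aims to remove.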
Pages: 2749-2753
Number of pages: 5