Statistical parametric speech synthesis for Arabic language using ANN

Cited by: 0
Authors
Ilyes, Rebai [1]
BenAyed, Yassine [1]
Affiliations
[1] Sfax Univ, MIRACL Multimedia InfoRmat Syst & Adv Comp Lab, Sfax, Tunisia
Source
2014 1ST INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR SIGNAL AND IMAGE PROCESSING (ATSIP 2014) | 2014
Keywords
Statistical parametric; speech synthesis; neural networks
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification codes
0808; 0809
Abstract
The statistical parametric approach to speech synthesis has become more popular than the concatenative approach because of its small system size and high-quality speech. Moreover, little research has been done on speech synthesis for Arabic, and the speech quality obtained so far is poor. In this paper, we propose a statistical parametric synthesis system for Arabic based on Artificial Neural Networks (ANN). Mel-frequency cepstral coefficients (MFCC), F0, energy, and duration are the main components of our system. The speech waveform is generated from the predicted F0, energy, and MFCC parameters. Different methods are proposed for this development process. In addition, we propose a method to solve the problem of discontinuities at the boundaries between neighboring segments in order to improve speech quality. Experimental results for the cepstral and prosodic parameters are given in this paper, together with a subjective evaluation.
Pages: 452-457
Page count: 6