SPEAKER AND LANGUAGE INDEPENDENT VOICE QUALITY CLASSIFICATION APPLIED TO UNLABELLED CORPORA OF EXPRESSIVE SPEECH

Cited by: 0
Authors
Kane, John [1]
Scherer, Stefan
Aylett, Matthew
Morency, Louis-Philippe
Gobl, Christer [1]
Affiliations
[1] Trinity Coll Dublin, Sch Linguist Speech & Commun Sci, Dublin, Ireland
Source
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2013
Funding
Science Foundation Ireland;
Keywords
Voice quality; glottal source; speech synthesis; expressive speech; audiobooks; GLOTTAL CLOSURE INSTANT; RECOGNITION; AMPLITUDE; EMOTION;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Voice quality plays a pivotal role in speech style variation. Therefore, control and analysis of voice quality are critical for many areas of speech technology. Until now, most work has focused on small, purpose-built corpora. In this paper, we apply state-of-the-art voice quality analysis to large speech corpora built for expressive speech synthesis. A fuzzy-input fuzzy-output support vector machine classifier is trained and validated using features extracted from these corpora. We then apply this classifier to freely available audiobook data and demonstrate a clustering of the voice qualities that approximates the performance of human perceptual ratings. The ability to detect voice quality variation in these widely available unlabelled audiobook corpora means that the proposed method may be used as a valuable resource in expressive speech synthesis.
Pages: 7982-7986
Number of pages: 5