A Modified Speaking Rate Estimation Based on Frame-Level LSTM

Cited by: 0
Authors
Xiao, Yanhong [1]
Du, Shixuan [1]
Xie, Xiang [1]
Wang, Jing [1]
Zhan, Qingran [1]
Affiliations
[1] Beijing Inst Technol, Beijing, Peoples R China
Source
PROCEEDINGS OF 2018 14TH IEEE INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP) | 2018
Keywords
frame-level LSTM; speaking rate estimation; segmentation; SPEECH;
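DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract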
Speaking rate is used in many domains, such as speech recognition, speaker verification, and emotion recognition. It conveys long-term information in speech and changes over time, so it can be treated as a time sequence. This paper proposes a frame-level LSTM method for speaking rate estimation. Instead of taking the whole utterance as a sequence, the frame-level LSTM exploits the sequence information within each segment and yields a more precise per-segment speaking rate estimate. We also evaluate the influence of fixed-length segmentation and voice activity detection (VAD) segmentation on speaking rate estimation. Results show that the proposed frame-level LSTM method yields a high correlation between the estimated speaking rate and the ground truth. It achieves a relative improvement of 13.0% over the state-of-the-art statistical learning method and 16.3% over support vector regression (SVR), evaluated on the same TIMIT corpus.
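The segmentation and evaluation steps mentioned in the abstract can be illustrated with a minimal sketch. It does not implement the LSTM. It only shows how fixed-length segments might be cut from a frame sequence, how a per-segment reference rate could be computed, and how estimates could be compared with ground truth by correlation. Several things here are assumptions not stated in the record: the speaking rate is taken as phone onsets per second, the frame shift is 10 ms, the correlation is Pearson's, and all function names are hypothetical.

```python
import math


def fixed_length_segments(num_frames, seg_len, hop=None):
    """Return (start, end) frame index pairs for fixed-length segments.

    Trailing frames that do not fill a whole segment are dropped.
    """
    hop = hop or seg_len
    return [(s, s + seg_len) for s in range(0, num_frames - seg_len + 1, hop)]


def segment_rates(phone_onsets_sec, segments, frame_shift_sec=0.01):
    """Per-segment speaking rate, assumed here to be phone onsets per second."""
    rates = []
    for start, end in segments:
        t0, t1 = start * frame_shift_sec, end * frame_shift_sec
        count = sum(1 for t in phone_onsets_sec if t0 <= t < t1)
        rates.append(count / (t1 - t0))
    return rates


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy usage: a 3-second utterance (300 frames) cut into 1-second segments.
segments = fixed_length_segments(300, 100)
reference = segment_rates([0.1, 0.5, 1.2, 1.3, 1.9, 2.5], segments)
estimated = [2.2, 2.7, 1.1]  # made-up model outputs, for illustration only
correlation = pearson(estimated, reference)
```

In this sketch the segment length and hop are free parameters. A VAD-based segmentation would replace `fixed_length_segments` with boundaries taken from detected speech regions.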
Pages: 600-603
Page count: 4