Vocal Effort Detection Based on Spectral Information Entropy Feature and Model Fusion

Cited by: 4
Authors
Chao, Hao [1]
Lu, Bao-Yun [1]
Liu, Yong-Li [1]
Zhi, Hui-Lai [1]
Affiliations
[1] Henan Polytechnic University, School of Computer Science and Technology, Jiaozuo, People's Republic of China
Source
JOURNAL OF INFORMATION PROCESSING SYSTEMS | 2018 / Vol. 14 / Issue 01
Keywords
Gaussian Mixture Model; Model Fusion; Multilayer Perceptron; Spectral Information Entropy; Support Vector Machine; Vocal Effort
DOI
10.3745/JIPS.04.0063
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Vocal effort detection is important for both robust speech recognition and speaker recognition. This paper first proposes a spectral information entropy feature, which carries more salient information about the vocal effort level. A model fusion method based on complementary models is then presented to recognize the vocal effort level. Experiments on an isolated-word test set show that spectral information entropy performs best among the three kinds of features evaluated, and the recognition accuracy across all vocal effort levels reaches 81.6%, demonstrating the potential of the proposed method.
Pages: 218-227
Number of pages: 10