Vocal Effort Detection Based on Spectral Information Entropy Feature and Model Fusion

Cited by: 4

Authors:

Chao, Hao [1]

Lu, Bao-Yun [1]

Liu, Yong-Li [1]

Zhi, Hui-Lai [1]

Affiliations:

[1] Henan Polytech Univ, Sch Comp Sci & Technol, Jiaozuo, Peoples R China

Source:

JOURNAL OF INFORMATION PROCESSING SYSTEMS | 2018 / Vol. 14 / No. 01

Keywords:

Gaussian Mixture Model; Model Fusion; Multilayer Perceptron; Spectral Information Entropy; Support Vector Machine; Vocal Effort;

DOI:

10.3745/JIPS.04.0063

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Vocal effort detection is important for both robust speech recognition and speaker recognition. This paper first proposes a spectral information entropy feature, which carries more salient information about the vocal effort level. A model fusion method based on complementary models is then presented to recognize the vocal effort level. Experiments on an isolated-word test set show that spectral information entropy performs best among the three kinds of features evaluated, and the recognition accuracy over all vocal effort levels reaches 81.6%. These results demonstrate the potential of the proposed method.

Citation

Pages: 218-227

Page count: 10

16 references in total

[1] Brungart D.S., 2001, EUROSPEECH 2001, P747
[2] Carlin M. A., 2006, P 9 INT C SPOK LANG, P1
[3] Chang C. C., 2016, LIBSVM LIB SUPPORT V
[4] Chao Hao (晁浩), 2016, Journal of Beijing University of Posts and Telecommunications (北京邮电大学学报), V39, P98
[5] Ghaffarzadegan Shabnam, 2014, 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), P2544, DOI 10.1109/ICASSP.2014.6854059
[6] Jovicic, Slobodan T.; Saric, Zoran. Acoustic analysis of consonants in whispered speech [J]. JOURNAL OF VOICE, 2008, 22 (03): 263-274
[7] Raitio T., 2013, P INT 13, P1544
[8] Saeidi, Rahim; Alku, Paavo; Baeckstroem, Tom. Feature Extraction Using Power-Law Adjusted Linear Prediction With Application to Speaker Recognition Under Severe Vocal Effort Mismatch [J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (01): 42-53
[9] Sarria-Paja, Milton; Falk, Tiago H. Whispered Speech Detection in Noise Using Auditory-Inspired Modulation Spectrum Features [J]. IEEE SIGNAL PROCESSING LETTERS, 2013, 20 (08): 783-786
[10] Shriberg E, 2008, INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, P609
