Multi-Resolution Feature Extraction Algorithm in Emotional Speech Recognition

Cited by: 2
Authors
Zelenik, Ales [1]
Kacic, Zdravko [2]
Affiliations
[1] NXP Semicond Gratkorn GmbH, A-8101 Gratkorn, Austria
[2] Fac Elect Engn & Comp Sci, Maribor 2000, Slovenia
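Keywords
Speech; emotion recognition; segmentation; multi-resolution
DOI
10.5755/j01.eee.21.5.13328
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract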
This paper presents a new approach to recognizing emotional speech from audio recordings. Choosing the optimum processing window width for feature extraction, and thus achieving the highest recognition rates, requires a trade-off between time and frequency resolution. We define a new procedure that combines the advantages of narrower and wider windows by dynamically adjusting the time and frequency resolution of individual features. To achieve higher recognition rates, two further procedures are added to the multi-resolution feature-extraction concept: excluding features calculated on different processing window widths, and using only the parts of recordings with the most explicit emotions. To confirm the benefits of the algorithm, audio recordings from the emotional speech database Interface were evaluated with four different classifiers. The highest emotion recognition rate of the multi-resolution approach exceeded that of the best single-resolution approach by 3.5 %, with an average improvement of 1.5 % in absolute terms.
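The multi-resolution idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the window widths (20, 50, 100 ms), the two features (log energy and spectral centroid), and the function names `frame_signal`, `frame_features`, and `multi_resolution_features` are all assumptions chosen for illustration. The sketch only shows how the same features can be computed on narrower and wider windows and then combined into one vector.

```python
import numpy as np


def frame_signal(x, win_len, hop_len):
    """Split a 1-D signal into overlapping frames (no padding at the end)."""
    n_frames = 1 + (len(x) - win_len) // hop_len
    idx = np.arange(win_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return x[idx]


def frame_features(x, sr, win_ms):
    """Per-frame log energy and spectral centroid for one window width.

    The 50 % overlap and the Hann window are illustrative assumptions.
    """
    win_len = int(sr * win_ms / 1000)
    hop_len = win_len // 2
    frames = frame_signal(x, win_len, hop_len) * np.hanning(win_len)
    energy = np.log(np.sum(frames ** 2, axis=1) + 1e-12)
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(win_len, d=1.0 / sr)
    centroid = (spectrum @ freqs) / (spectrum.sum(axis=1) + 1e-12)
    return np.stack([energy, centroid], axis=1)  # shape: (n_frames, 2)


def multi_resolution_features(x, sr, widths_ms=(20, 50, 100)):
    """Concatenate mean and standard deviation of each feature per window width."""
    parts = []
    for w in widths_ms:
        f = frame_features(x, sr, w)
        parts.append(np.concatenate([f.mean(axis=0), f.std(axis=0)]))
    return np.concatenate(parts)  # 4 values per window width


if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    x = np.sin(2 * np.pi * 440.0 * t)  # 1 s synthetic tone, not speech
    print(multi_resolution_features(x, sr).shape)
```

A feature-selection step, such as the exclusion procedure mentioned in the abstract, would operate on the concatenated vector; it is left out here because the record gives no details of it.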
Pages: 54-58
Number of pages: 5