Multi-class SVM for stressed speech recognition

Citations: 0
Authors
Besbes, Salsabil [1]
Lachiri, Zied [2]
Affiliations
[1] Univ Tunis El Manar, Natl Sch Engineers Tunis, Signal Image & Informat Technol Lab, BP 37 Le Belvédère, Tunis 1002, Tunisia
[2] Univ Tunis El Manar, Natl Sch Engineers Tunis, BP 37 Le Belvédère, Tunis 1002, Tunisia
Source
2016 2ND INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR SIGNAL AND IMAGE PROCESSING (ATSIP) | 2016
Keywords
speech recognition; multi-class support vector machines; stressed context; SUSAS database; GFCC
DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract
This paper presents a new automatic stressed speech recognition system based on kernel classification. Advanced acoustic features are extracted from the stressed signals, and multi-class Support Vector Machines with different kernels are used to recognize speech utterances under stress. Gammatone Frequency Cepstral Coefficients (GFCC) are also used. The implemented system is tested on isolated words from the SUSAS database with four classes: Neutral, Angry, Lombard, and Loud. Experimental results show that the best performance is obtained when the auditory feature is used with a combination of different descriptors, although this depends on the type of kernel used.
Pages: 782-787
Number of pages: 6