Multi-class SVM for stressed speech recognition

Cited by: 0

Authors:

Besbes, Salsabil [1]

Lachiri, Zied [2]

Affiliations:

[1] Univ Tunis El Manar, Natl Sch Engineers Tunis, Signal Image & Informat Technol Lab, BP 37 Le Belvédère, Tunis 1002, Tunisia

[2] Univ Tunis El Manar, Natl Sch Engineers Tunis, BP 37 Le Belvédère, Tunis 1002, Tunisia

Source:

2016 2ND INTERNATIONAL CONFERENCE ON ADVANCED TECHNOLOGIES FOR SIGNAL AND IMAGE PROCESSING (ATSIP) | 2016

Keywords:

speech recognition; multi-class support vector machines; stressed context; SUSAS database; GFCC

DOI:

Not available

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

This paper presents a new automatic stressed speech recognition system based on kernel classification. We extract advanced acoustic features from the stressed signals and employ multi-class Support Vector Machines with different kernels to recognize speech utterances under stress. Gammatone Frequency Cepstral Coefficients (GFCC) are also considered. The implemented system is tested on isolated words from the SUSAS database with four classes: Neutral, Angry, Lombard, and Loud. Experimental results show that the best performance is obtained when the auditory feature is combined with different descriptors, although the outcome depends on the kernel used.


Pages: 782-787

Page count: 6
