Classification of Multi Speaker Shouted Speech and Single Speaker Normal Speech

Cited by: 0

Authors:

Baghel, Shikha [1]

Prasanna, S. R. Mahadeva [1]

Guha, Prithwijit [1]

Affiliation:

[1] Indian Inst Technol Guwahati, Dept Elect & Elect Engn, Gauhati 781039, Assam, India

Source:

TENCON 2017 - 2017 IEEE REGION 10 CONFERENCE | 2017

Keywords:

Shouted / normal speech classification; Source features; spectral features; SVM;

DOI:

Not available

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

This work proposes a method for classifying multi-speaker shouted speech versus single-speaker normal speech, the most frequently occurring scenario in news debates. For brevity, the multi-speaker shouted and single-speaker normal classes are referred to as shouted and normal speech, respectively. Spectral and source features are explored for the classification task. The source characteristics are studied in terms of strength of excitation (SoE). The spectral features explored are spectral flux, spectral tilt, the sum of the ten largest spectral peaks (STLP), modulation spectrum energy (ModSE), and Mel-frequency cepstral coefficients (MFCCs). Shouted and normal speech are classified using two approaches. In the first, all features except the MFCCs are non-linearly mapped and combined using a threshold-based technique. In the second, a Support Vector Machine (SVM) classifier with a predefined radial basis function (RBF) kernel is applied to the extracted features. Performance is evaluated in terms of F-score. A leave-one-out analysis is also carried out to measure the strength of each individual feature for this task; it shows SoE to be the most important of the one-dimensional features. The F-score is highest when all features are combined into a forty-four-dimensional feature vector.
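The evaluation described in the abstract (threshold-based combination of non-linearly mapped features, scored by F-score, with a leave-one-out analysis over features) can be illustrated with a minimal Python sketch. The abstract does not state the mapping function, the combination rule, or any threshold values, so the sigmoid mapping, the mean-based combination, the parameter values, and the toy feature values below are all assumptions made purely for illustration; they are not the authors' method or data.

```python
import math

# Hypothetical one-dimensional feature names; all values below are invented.
FEATURES = ["soe", "spectral_flux", "spectral_tilt"]


def f_score(y_true, y_pred, positive=1):
    """F-score (harmonic mean of precision and recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def sigmoid_map(x, center, slope):
    """Assumed non-linear mapping of a raw feature value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-slope * (x - center)))


def classify(frames, feature_names, params, threshold=0.5):
    """Label a frame 1 (shouted) when the mean mapped evidence exceeds threshold."""
    labels = []
    for frame in frames:
        evidence = [sigmoid_map(frame[n], *params[n]) for n in feature_names]
        labels.append(1 if sum(evidence) / len(evidence) > threshold else 0)
    return labels


def leave_one_feature_out(frames, y_true, feature_names, params):
    """F-score obtained when each feature in turn is removed from the set."""
    scores = {}
    for left_out in feature_names:
        kept = [n for n in feature_names if n != left_out]
        scores[left_out] = f_score(y_true, classify(frames, kept, params))
    return scores


# Toy, made-up frames (1 = shouted, 0 = normal); not taken from the paper.
FRAMES = [
    {"soe": 0.9, "spectral_flux": 0.7, "spectral_tilt": 0.6},
    {"soe": 0.8, "spectral_flux": 0.4, "spectral_tilt": 0.7},
    {"soe": 0.2, "spectral_flux": 0.6, "spectral_tilt": 0.3},
    {"soe": 0.1, "spectral_flux": 0.2, "spectral_tilt": 0.4},
]
LABELS = [1, 1, 0, 0]
# Assumed (center, slope) of the sigmoid for every feature.
PARAMS = {name: (0.5, 10.0) for name in FEATURES}

if __name__ == "__main__":
    baseline = f_score(LABELS, classify(FRAMES, FEATURES, PARAMS))
    print("all features:", baseline)
    for name, score in leave_one_feature_out(FRAMES, LABELS, FEATURES, PARAMS).items():
        print("without", name, "->", score)
```

In such an analysis, the feature whose removal lowers the F-score the most would be read as the strongest contributor. The toy numbers here say nothing about the results reported in the paper.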


Pages: 2388 - 2392

Page count: 5

50 items in total

[1] Single-speaker/multi-speaker co-channel speech classification
Rossignol, Stephane
Pietquin, Olivier
11TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2010 (INTERSPEECH 2010), VOLS 3 AND 4, 2010, : 2322 - 2325
[2] Shouted and whispered speech compensation for speaker verification systems
Prieto, Santi
Ortega, Alfonso
Lopez-Espejo, Ivan
Lleida, Eduardo
DIGITAL SIGNAL PROCESSING, 2022, 127
[3] SPEAKER IDENTIFICATION FROM SHOUTED SPEECH: ANALYSIS AND COMPENSATION
Hanilci, Cemal
Kinnunen, Tomi
Saeidi, Rahim
Pohjalainen, Jouni
Alku, Paavo
Ertas, Figen
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 8027 - 8031
[4] THE SPEECH OF THE DEAF AND OF THE NORMAL SPEAKER
Bodycomb, Margaret
VOLTA REVIEW, 1946, 48 (11) : 637 - 638
[5] Shouted Speech Compensation for Speaker Verification Robust to Vocal Effort Conditions
Prieto, Santi
Ortega, Alfonso
Lopez-Espejo, Ivan
Lleida, Eduardo
INTERSPEECH 2020, 2020, : 1511 - 1515
[6] NORMAL-TO-SHOUTED SPEECH SPECTRAL MAPPING FOR SPEAKER RECOGNITION UNDER VOCAL EFFORT MISMATCH
Lopez, Ana Ramirez
Saeidi, Rahim
Juvela, Lauri
Alku, Paavo
2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 4940 - 4944
[7] Effect of High-Energy Voiced Speech Segments and Speaker Gender on Shouted Speech Detection
Baghel, Shikha
Prasanna, S. R. M.
Guha, Prithwijit
2021 NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2021, : 53 - 58
[8] Speaker Clustering with Penalty Distance for Speaker Verification with Multi-Speaker Speech
Das, Rohan Kumar
Yang, Jichen
Li, Haizhou
2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 1630 - 1635
[9] Shouted / Normal Speech Classification using Speech-Specific Features
Baghel, Shikha
Khonglah, Banriskhem K.
Prasanna, S. R. Mahadeva
Guha, Prithwijit
PROCEEDINGS OF THE 2016 IEEE REGION 10 CONFERENCE (TENCON), 2016, : 1655 - 1659
[10] Unsupervised classification of speaker roles in multi-participant conversational speech
Li, Yanxiong
Wang, Qin
Zhang, Xue
Li, Wei
Li, Xinchao
Yang, Jichen
Feng, Xiaohui
Huang, Qian
He, Qianhua
COMPUTER SPEECH AND LANGUAGE, 2017, 42 : 81 - 99
