Classification of Multi Speaker Shouted Speech and Single Speaker Normal Speech

Cited by: 0

Authors:

Baghel, Shikha [1]

Prasanna, S. R. Mahadeva [1]

Guha, Prithwijit [1]

Affiliations:

[1] Indian Inst Technol Guwahati, Dept Elect & Elect Engn, Gauhati 781039, Assam, India

Source:

TENCON 2017 - 2017 IEEE REGION 10 CONFERENCE | 2017

Keywords:

Shouted/normal speech classification; source features; spectral features; SVM

DOI:

Not available

Chinese Library Classification (CLC) codes:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract:

This work proposes a method for classifying multi-speaker shouted speech versus single-speaker normal speech, the most frequently occurring scenario in news debates. Throughout, the multi-speaker shouted and single-speaker normal classes are referred to simply as shouted and normal speech, respectively. Both spectral and source features are explored for the classification task. The source characteristics are studied in terms of strength of excitation (SoE). The spectral features are spectral flux, spectral tilt, sum of ten largest spectral peaks (STLP), modulation spectrum energy (ModSE), and Mel frequency cepstral coefficients (MFCCs). Shouted and normal speech are classified using two approaches. In the first, all features except MFCCs are non-linearly mapped and combined using a threshold-based technique. In the second, a Support Vector Machine (SVM) classifier with a predefined radial basis function (RBF) kernel is applied to the extracted features. Performance is evaluated in terms of F-score, and a leave-one-out analysis measures the strength of each feature for this task. According to the leave-one-out analysis, SoE is the most important of the one-dimensional features. The highest F-score is obtained when all features are combined into a 44-dimensional feature vector.
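
The evaluation described in the abstract (F-score plus a leave-one-out analysis over features) can be illustrated with a minimal sketch. It assumes that "leave one out" means removing one feature at a time and measuring the drop in F-score; the abstract does not spell out the procedure. The function names `f_score` and `leave_one_feature_out`, and the `evaluate` callback (standing in for training and testing a classifier on a feature subset), are hypothetical and not taken from the paper.

```python
from typing import Callable, Dict, List, Sequence


def f_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Balanced F-score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def leave_one_feature_out(
    features: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Dict[str, float]:
    """Return, per feature, the F-score drop observed when it is removed.

    `evaluate` is an assumed callback that scores a classifier trained on
    the given feature subset; a larger drop suggests a more important feature.
    """
    baseline = evaluate(list(features))
    drops = {}
    for name in features:
        subset = [f for f in features if f != name]
        drops[name] = baseline - evaluate(subset)
    return drops
```

In use, `evaluate` would wrap whatever classifier is under study, for example an RBF-kernel SVM trained on the selected features, and return its F-score on held-out data.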


Pages: 2388-2392

Page count: 5
