Classification of Multi Speaker Shouted Speech and Single Speaker Normal Speech

Cited by: 0
Authors
Baghel, Shikha [1]
Prasanna, S. R. Mahadeva [1]
Guha, Prithwijit [1]
Affiliations
[1] Indian Inst Technol Guwahati, Dept Elect & Elect Engn, Gauhati 781039, Assam, India
Source
TENCON 2017 - 2017 IEEE REGION 10 CONFERENCE | 2017
Keywords
Shouted / normal speech classification; Source features; spectral features; SVM;
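DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract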
This work proposes a method for classifying multi-speaker shouted speech versus single-speaker normal speech, the most frequently occurring scenario in news debates. In this work, the multi-speaker shouted and single-speaker normal speech classes are referred to as shouted and normal speech, respectively. Spectral features and source features are explored for the classification task. The source characteristics are studied in terms of strength of excitation (SoE). Spectral flux, spectral tilt, the sum of the ten largest spectral peaks (STLP), modulation spectrum energy (ModSE) and Mel frequency cepstral coefficients (MFCCs) are explored as the spectral features. Shouted and normal speech are classified using two approaches. In the first approach, all of these features except the MFCCs are non-linearly mapped and combined using a threshold-based technique. In the second approach, a Support Vector Machine (SVM) classifier with a predefined radial basis function (RBF) kernel is applied to the extracted features. Performance is evaluated in terms of F-score. It is also evaluated through a leave-one-out analysis, which measures the strength of each individual feature for this task. According to the leave-one-out analysis, SoE is the most important of the one-dimensional features. When all the features are combined for classification, the resulting forty-four-dimensional feature vector achieves the highest F-score.
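The threshold-based approach and the leave-one-out analysis described in the abstract can be illustrated with a minimal sketch. The abstract does not give the mapping function, the combination rule, or the threshold, so the sigmoid mapping, the averaging of mapped values, the 0.5 threshold, and all function and parameter names below are assumptions made for illustration only. Only the F-score formula (the harmonic mean of precision and recall) is standard.

```python
import math


def f_score(y_true, y_pred, positive=1):
    """F-score: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def sigmoid_map(value, center, slope):
    """Assumed non-linear mapping of a raw feature value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-slope * (value - center)))


def classify(frame, params, threshold=0.5):
    """Label a frame 1 (shouted) or 0 (normal).

    frame:  dict mapping feature name -> raw value
    params: dict mapping feature name -> (center, slope) of the mapping
    The mapped values are averaged and compared with a threshold
    (both the averaging and the threshold value are assumptions).
    """
    evidence = [sigmoid_map(frame[name], c, s) for name, (c, s) in params.items()]
    return 1 if sum(evidence) / len(evidence) > threshold else 0


def leave_one_out(frames, labels, params):
    """F-score obtained when each feature in turn is left out.

    A large drop relative to the all-features score suggests that the
    omitted feature carries much of the discriminative information.
    """
    scores = {}
    for left_out in params:
        reduced = {k: v for k, v in params.items() if k != left_out}
        preds = [classify(frame, reduced) for frame in frames]
        scores[left_out] = f_score(labels, preds)
    return scores
```

In this sketch, each feature would need its own `(center, slope)` pair tuned on held-out data; a negative slope can be used for a feature whose value is expected to fall rather than rise for shouted speech. The SVM-based second approach is not covered here.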
Pages: 2388 - 2392
Page count: 5