Classification of Multi Speaker Shouted Speech and Single Speaker Normal Speech

Authors
Baghel, Shikha [1]
Prasanna, S. R. Mahadeva [1]
Guha, Prithwijit [1]
Affiliations
[1] Indian Inst Technol Guwahati, Dept Elect & Elect Engn, Gauhati 781039, Assam, India
Source
TENCON 2017 - 2017 IEEE REGION 10 CONFERENCE | 2017
Keywords
Shouted / normal speech classification; Source features; Spectral features; SVM
DOI
Not available
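Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809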
Abstract
This work proposes a method for classifying multi-speaker shouted speech versus single-speaker normal speech, which is the most frequently occurring scenario in news debates. For brevity, the multi-speaker shouted and single-speaker normal classes are referred to as shouted and normal speech, respectively. Spectral features and source features are explored for the classification task. The source characteristics are studied in terms of the strength of excitation (SoE). The spectral features explored are spectral flux, spectral tilt, the sum of the ten largest spectral peaks (STLP), modulation spectrum energy (ModSE), and Mel frequency cepstral coefficients (MFCCs). Shouted and normal speech are classified using two approaches. In the first approach, all features except the MFCCs are non-linearly mapped and combined using a threshold-based technique. In the second approach, a Support Vector Machine (SVM) classifier with a predefined radial basis function (RBF) kernel is applied to the extracted features. Performance is evaluated in terms of F-score. A leave-one-out analysis is also performed to measure the strength of each individual feature for this task. According to this analysis, SoE is the most important of the one-dimensional features. When all the features are combined for classification, the resulting forty-four-dimensional feature vector yields the highest F-score.
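The evaluation described in the abstract (F-score plus a leave-one-out analysis over features) can be illustrated with a minimal sketch. This is not the authors' code: the function names `f_score` and `leave_one_out_drop`, and the `evaluate` callback (assumed to train and score a classifier on a given feature subset and return its F-score), are hypothetical and introduced only for illustration. The abstract does not say how the contribution of a left-out feature is quantified; the drop in F-score used here is an assumption.

```python
from typing import Callable, Dict, List, Sequence


def f_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F-score (harmonic mean of precision and recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def leave_one_out_drop(
    feature_names: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Dict[str, float]:
    """Score each feature by how much the F-score falls when it is left out.

    `evaluate` is a hypothetical callback: given a list of feature names,
    it returns the F-score obtained with that feature subset.
    """
    full_score = evaluate(list(feature_names))
    drops = {}
    for name in feature_names:
        remaining = [f for f in feature_names if f != name]
        drops[name] = full_score - evaluate(remaining)
    return drops
```

Under this assumption, the feature whose removal causes the largest drop would be read as the most important one, which is the sense in which the abstract reports SoE as the strongest one-dimensional feature.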
Pages: 2388-2392
Page count: 5