Exploring the Role of Pitch-Adaptive Cepstral Features in Context of Children's Mismatched ASR

Cited by: 0
Authors
Sinha, Rohit [1]
Shahnawazuddin, S. [1]
Karthik, Patri Satya [1]
Affiliations
[1] Indian Inst Technol Guwahati, Dept Elect & Elect Engn, Gauhati 781039, India
Source
2016 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM), 2016
Keywords
Children's speech recognition; pitch-adaptive features; STRAIGHT-based MFCC; DNN; REPRESENTATIONS; RECOGNITION;
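DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics, communication technology]
Discipline classification code
0808; 0809
Abstract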
This work explores the role of pitch-adaptive cepstral features in automatic speech recognition (ASR) of children's speech using acoustic models trained on adults' speech. Owing to the large acoustic mismatch between training and test data, recognition rates are highly degraded in such cases. Earlier studies have shown that this acoustic mismatch is aggravated by the insufficient smoothing of pitch harmonics in mel-frequency cepstral coefficient (MFCC) features for child speakers. Motivated by that, we explore pitch-adaptive cepstral features for reducing the sensitivity to gross pitch variations. For this purpose, a simple technique based on adaptive cepstral truncation is employed to derive the pitch-adaptive MFCCs. We also explore the existing STRAIGHT-based MFCCs for contrast. Both approaches are found to yield significant and similar improvements in the children's mismatched ASR case. The effectiveness of the adaptive-truncation-based approach is also demonstrated in the context of deep-neural-network-based acoustic models. Further, it is shown that the effectiveness of existing feature normalization techniques remains intact even with the use of the proposed features.
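The abstract does not spell out the truncation procedure, so the following is only a generic sketch of pitch-dependent cepstral truncation (low-time liftering), not the authors' exact method. It assumes the pitch `f0` is already known for the frame. The function names, the `keep_fraction` parameter, and the 1024-point FFT are illustrative assumptions; the mel filterbank and DCT that would turn the smoothed spectrum into MFCCs are omitted.

```python
import numpy as np

def truncation_length(fs, f0, keep_fraction=0.5):
    """Number of low-quefrency cepstral samples to keep (hypothetical rule).

    The pitch period fs / f0 (in samples) is where the first pitch peak
    appears in the cepstrum; keeping only a fraction of it discards that peak.
    """
    pitch_period = fs / f0
    return max(1, int(keep_fraction * pitch_period))

def log_magnitude(frame, n_fft=1024):
    """Log-magnitude spectrum of one Hamming-windowed frame."""
    windowed = frame * np.hamming(len(frame))
    return np.log(np.abs(np.fft.rfft(windowed, n_fft)) + 1e-10)

def smooth_log_spectrum(log_mag, cutoff):
    """Zero the real cepstrum at and above quefrency index `cutoff`, transform back."""
    n_fft = 2 * (len(log_mag) - 1)
    # Real cepstrum: inverse FFT of the even-symmetric log-magnitude spectrum.
    cepstrum = np.fft.irfft(log_mag, n_fft)
    lifter = np.zeros(n_fft)
    lifter[:cutoff] = 1.0
    lifter[n_fft - cutoff + 1:] = 1.0  # mirrored half keeps the cepstrum symmetric
    return np.fft.rfft(cepstrum * lifter).real

# Example: a synthetic high-pitched frame (300 Hz fundamental, 10 harmonics).
fs = 16000
t = np.arange(400) / fs
frame = sum(np.cos(2 * np.pi * 300 * k * t) for k in range(1, 11))
cutoff = truncation_length(fs, 300.0)
smoothed = smooth_log_spectrum(log_magnitude(frame), cutoff)
```

In this sketch a higher pitch gives a shorter pitch period and therefore a shorter truncation length, so the cutoff stays below the pitch peak and the pitch-harmonic ripple is smoothed out of the spectral envelope.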
Pages: 5
Related papers
15 in total
  • [1] Pitch-Adaptive Front-end Features for Robust Children's ASR
    Shahnawazuddin, S.
    Dey, Abhishek
    Sinha, Rohit
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3459 - 3463
  • [2] Pitch adaptive MFCC features for improving children's mismatched ASR
    Ghai, Shweta
    Sinha, Rohit
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2015, 18 (03) : 489 - 503
  • [3] Assessment of pitch-adaptive front-end signal processing for children's speech recognition
    Sinha, Rohit
    Shahnawazuddin, S.
    COMPUTER SPEECH AND LANGUAGE, 2018, 48 : 103 - 121
  • [4] Gammatone-Filterbank Based Pitch-Normalized Cepstral Coefficients for Zero-Resource Children's ASR
    Shahnawazuddin, Syed
    Ankita
    Kumar, Avinash
    Kathania, Hemant Kumar
    SPEECH AND COMPUTER, SPECOM 2023, PT I, 2023, 14338 : 494 - 505
  • [5] A Study on the Effect of Pitch on LPCC and PLPC Features for Children's ASR in comparison to MFCC
    Ghai, Shweta
    Sinha, Rohit
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2600 - 2603
  • [6] Improving children's mismatched ASR using structured low-rank feature projection
    Shahnawazuddin, S.
    Kathania, Hemant K.
    Dey, Abhishek
    Sinha, Rohit
    SPEECH COMMUNICATION, 2018, 105 : 103 - 113
  • [7] ENHANCING NOISE AND PITCH ROBUSTNESS OF CHILDREN'S ASR
    Shahnawazuddin, S.
    Deepak, K. T.
    Pradhan, Gayadhar
    Sinha, Rohit
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 5225 - 5229
  • [8] Studying the role of pitch-adaptive spectral estimation and speaking-rate normalization in automatic speech recognition
    Shahnawazuddin, S.
    Adiga, Nagaraj
    Kathania, Hemant K.
    Pradhan, Gayadhar
    Sinha, Rohit
    DIGITAL SIGNAL PROCESSING, 2018, 79 : 142 - 151
  • [9] Exploring the Role of Spectral Smoothing in context of Children's Speech Recognition
    Ghai, Shweta
    Sinha, Rohit
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 1571 - 1574
  • [10] Pitch-Normalized Acoustic Features for Robust Children's Speech Recognition
    Shahnawazuddin, Syed
    Sinha, Rohit
    Pradhan, Gayadhar
    IEEE SIGNAL PROCESSING LETTERS, 2017, 24 (08) : 1128 - 1132