Time-Varying LP Cepstral Features for Improved Isolated Word Speech Recognition

Cited by: 0
Authors
Ang, Federico [1]
Tsutsui, Hiroshi [1]
Miyanaga, Yoshikazu [1]
Affiliation
[1] Hokkaido Univ, ICN Lab, Sapporo, Hokkaido 0600814, Japan
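Source
2015 IEEE INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP) | 2015
Keywords
time-varying AR model; isolated word speech recognition; linear prediction
DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract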
Isolated word speech recognition for small-vocabulary tasks has found great success with Mel-frequency cepstral coefficients as the speech feature of choice. Voice-controlled embedded systems, using word models as the basic units of speech, have found their way into a variety of commercial products. While the recognition rates of these products can be considered commercially acceptable in clean environments, channel noise and other external factors can still degrade recognition performance in practice. We propose the use of cepstral features derived from time-varying linear predictive coding, where the autoregressive model of the speech signal is represented by coefficients that are linear combinations of simple basis functions. Variations in the usage of the features are investigated, such as skipping adjacent features, averaging, and hybrid features, with the goal of improving the performance of a 142-word-vocabulary, isolated-word Japanese speech recognition task.
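As a rough illustration of the modeling idea in the abstract (AR coefficients expressed as linear combinations of simple basis functions), the following is a minimal sketch and not the authors' implementation. It assumes a power (polynomial) basis in normalised time, a plain least-squares fit over one analysis window, and the sign convention x[n] = -sum_i a_i[n] x[n-i]; the function names `tvlp_coefficients` and `coefficient_tracks` are hypothetical, and the record does not state which basis, model order, or estimator the paper uses.

```python
import numpy as np


def tvlp_coefficients(x, order=2, n_basis=2):
    """Fit a time-varying LP model by least squares (illustrative sketch).

    Assumed model: x[n] = -sum_{i=1..order} a_i[n] * x[n-i], with
    a_i[n] = sum_{k=0..n_basis-1} c[i-1, k] * (n / N)**k  (power basis).
    Returns the basis weights c with shape (order, n_basis).
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    n = np.arange(order, N)          # samples that have a full history
    t = n / N                        # normalised time in [0, 1)
    basis = np.stack([t**k for k in range(n_basis)], axis=1)

    # One regression column per (lag i, basis function k): f_k[n] * x[n-i].
    cols = [basis[:, k] * x[n - i]
            for i in range(1, order + 1)
            for k in range(n_basis)]
    A = np.stack(cols, axis=1)

    # Solve -A c = x[n] in the least-squares sense.
    c, *_ = np.linalg.lstsq(-A, x[n], rcond=None)
    return c.reshape(order, n_basis)


def coefficient_tracks(c, num_samples):
    """Expand basis weights into per-sample AR coefficients a_i[n]."""
    t = np.arange(num_samples) / num_samples
    basis = np.stack([t**k for k in range(c.shape[1])], axis=1)
    return basis @ c.T               # shape (num_samples, order)
```

With `n_basis=1` the basis reduces to a constant and the fit becomes an ordinary (time-invariant) covariance-style LP analysis of the window; cepstral features would then be computed from the resulting coefficients in a separate step not shown here.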
Pages: 302-306
Page count: 5
Related papers
50 in total
  • [1] Incorporation of Time-Varying LP Cepstral Features in HMM-Based Isolated Word Speech Recognition
    Ang, Federico
    Tsutsui, Hiroshi
    Miyanaga, Yoshikazu
    2015 INTERNATIONAL SYMPOSIUM ON SIGNALS, CIRCUITS AND SYSTEMS (ISSCS), 2015,
  • [2] A method of extracting time-varying acoustic features effective for speech recognition
    Tanaka, K
    Kojima, H
    1997 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I - V: VOL I: PLENARY, EXPERT SUMMARIES, SPECIAL, AUDIO, UNDERWATER ACOUSTICS, VLSI; VOL II: SPEECH PROCESSING; VOL III: SPEECH PROCESSING, DIGITAL SIGNAL PROCESSING; VOL IV: MULTIDIMENSIONAL SIGNAL PROCESSING, NEURAL NETWORKS - VOL V: STATISTICAL SIGNAL AND ARRAY PROCESSING, APPLICATIONS, 1997, : 1391 - 1394
  • [3] Speech recognition using cepstral articulatory features
    Najnin, Shamima
    Banerjee, Bonny
    SPEECH COMMUNICATION, 2019, 107 : 26 - 37
  • [4] Joint Bayesian Estimation of Time-Varying LP Parameters and Excitation for Speech
    Chetupalli, Srikanth Raj
    Sreenivas, T. V.
    IEEE SIGNAL PROCESSING LETTERS, 2017, 24 (04) : 357 - 361
  • [5] Recursive estimation of time-varying environments for robust speech recognition
    Zhao, YX
    Wang, SJ
    Yen, KC
    2001 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS I-VI, PROCEEDINGS: VOL I: SPEECH PROCESSING 1; VOL II: SPEECH PROCESSING 2 IND TECHNOL TRACK DESIGN & IMPLEMENTATION OF SIGNAL PROCESSING SYSTEMS NEURALNETWORKS FOR SIGNAL PROCESSING; VOL III: IMAGE & MULTIDIMENSIONAL SIGNAL PROCESSING MULTIMEDIA SIGNAL PROCESSING - VOL IV: SIGNAL PROCESSING FOR COMMUNICATIONS; VOL V: SIGNAL PROCESSING EDUCATION SENSOR ARRAY & MULTICHANNEL SIGNAL PROCESSING AUDIO & ELECTROACOUSTICS; VOL VI: SIGNAL PROCESSING THEORY & METHODS STUDENT FORUM, 2001, : 225 - 228
  • [6] Robust speech recognition with time-varying filtering, interruptions, and noise
    Lippmann, RP
    Carlson, BA
    1997 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING, PROCEEDINGS, 1997, : 365 - 372
  • [7] Robust Speech Recognition Combining Cepstral and Articulatory Features
    Zha, Zhuan-ling
    Hu, Jin
    Zhan, Qing-ran
    Shan, Ya-hui
    Xie, Xiang
    Wang, Jing
    Cheng, Hao-bo
    PROCEEDINGS OF 2017 3RD IEEE INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATIONS (ICCC), 2017, : 1401 - 1405
  • [8] NMF-based Cepstral Features for Speech Emotion Recognition
    Lashkari, Milad
    Seyedin, Sanaz
    2018 4TH IRANIAN CONFERENCE ON SIGNAL PROCESSING AND INTELLIGENT SYSTEMS (ICSPIS), 2018, : 189 - 193
  • [9] Multiple-microphone time-varying filters for robust speech recognition
    Lai, CYK
    Aarabi, P
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 233 - 236
  • [10] BANGLA ISOLATED WORD SPEECH RECOGNITION
    Firoze, Adnan
    Arifin, M. Shamsul
    Quadir, Ryana
    Rahman, Rashedur M.
    ICEIS 2011: PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON ENTERPRISE INFORMATION SYSTEMS, VOL 2, 2011, : 73 - 82