DEEP LEARNING OF SPLIT TEMPORAL CONTEXT FOR AUTOMATIC SPEECH RECOGNITION

Cited: 0
Authors
Baccouche, Moez [1 ]
Besset, Benoit [1 ]
Collen, Patrice [1 ]
Le Blouch, Olivier [1 ]
Affiliations
[1] France Telecom, Orange Labs, F-35510 Cesson Sevigne, France
Source
2014 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2014
Keywords
Speech recognition; neural networks; deep learning; split temporal context;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline Codes
070206 ; 082403 ;
Abstract
This paper follows recent advances in speech recognition which recommend replacing the standard hybrid GMM/HMM approach with deep neural architectures. These models have been shown to drastically improve recognition performance, owing to their ability to capture the underlying structure of the data. However, they remain particularly complex, since the entire temporal context of a given phoneme is learned with a single model, which must therefore have a very large number of trainable weights. This work proposes an alternative solution that splits the temporal context into blocks, each learned with a separate deep model. We demonstrate that this approach significantly reduces the number of parameters compared with the classical deep learning procedure, and obtains better results on the TIMIT dataset, among the best state-of-the-art results (20.20% PER). We also show that our approach can assimilate data of different natures, ranging from wideband to narrowband signals.
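To make the core idea concrete, here is a minimal NumPy sketch of splitting a phoneme's temporal context into blocks, each handled by its own small network, with a merger network producing phoneme posteriors. The window size, feature dimension, block boundaries, layer sizes, and the `mlp` helper are all illustrative assumptions, not the paper's actual configuration; the point is only that several small per-block models plus a merger can replace one large model over the whole context.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: an 11-frame context window of 39-dim acoustic
# feature vectors around a phoneme, split into three temporal blocks.
N_FRAMES, FEAT_DIM, N_PHONES = 11, 39, 40
context = rng.standard_normal((N_FRAMES, FEAT_DIM))
blocks = [context[:4], context[4:7], context[7:]]  # left / center / right

def mlp(x, hidden, out, seed):
    """One small network with random (untrained) weights, for illustration."""
    r = np.random.default_rng(seed)
    w1 = r.standard_normal((x.size, hidden)) * 0.1
    w2 = r.standard_normal((hidden, out)) * 0.1
    h = np.tanh(x.ravel() @ w1)
    return h @ w2

# Each block is processed by a separate model; a merger combines their outputs.
block_outputs = [mlp(b, hidden=64, out=32, seed=i) for i, b in enumerate(blocks)]
merged = np.concatenate(block_outputs)           # 96-dim merged representation
logits = mlp(merged, hidden=64, out=N_PHONES, seed=99)
posteriors = np.exp(logits - logits.max())
posteriors /= posteriors.sum()                   # softmax over phoneme classes

print(posteriors.shape)
```

Because each block model sees only a few frames, its input layer (and hence its weight count) is much smaller than that of a single model spanning the full context, which is the parameter saving the abstract refers to.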
Pages: 5
Related Papers
50 records
  • [21] Emotion Recognition from Human Speech Using Temporal Information and Deep Learning
    Kim, John W.
    Saurous, Rif A.
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 937 - 940
  • [22] HindiSpeech-Net: a deep learning based robust automatic speech recognition system for Hindi language
    Sharma, Usha
    Om, Hari
    Mishra, A. N.
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (11) : 16173 - 16193
  • [23] On the Correlation and Transferability of Features between Automatic Speech Recognition and Speech Emotion Recognition
    Fayek, Haytham M.
    Lech, Margaret
    Cavedon, Lawrence
    17TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2016), VOLS 1-5: UNDERSTANDING SPEECH PROCESSING IN HUMANS AND MACHINES, 2016, : 3618 - 3622
  • [25] On Comparison of Deep Learning Architectures for Distant Speech Recognition
    Sustika, Rika
    Yuliani, Asri R.
    Zaenudin, Efendi
    Pardede, Hilman F.
    2017 2ND INTERNATIONAL CONFERENCES ON INFORMATION TECHNOLOGY, INFORMATION SYSTEMS AND ELECTRICAL ENGINEERING (ICITISEE): OPPORTUNITIES AND CHALLENGES ON BIG DATA FUTURE INNOVATION, 2017, : 17 - 21
  • [26] Lightweight Deep Learning Framework for Speech Emotion Recognition
    Akinpelu, Samson
    Viriri, Serestina
    Adegun, Adekanmi
    IEEE ACCESS, 2023, 11 : 77086 - 77098
  • [27] Deep Learning of Speech Features for Improved Phonetic Recognition
    Lee, Jaehyung
    Lee, Soo-Young
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 1256 - 1259
  • [28] A temporal auditory model with adaptation for automatic speech recognition
    Haque, Serajul
    Togneri, Roberto
    Zaknich, Anthony
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 1141 - +
  • [29] SAYS WHO? DEEP LEARNING MODELS FOR JOINT SPEECH RECOGNITION, SEGMENTATION AND DIARIZATION
    Sarkar, Amitrajit
    Dasgupta, Surajit
    Naskar, Sudip Kumar
    Bandyopadhyay, Sivaji
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5229 - 5233
  • [30] Deep-Learning-Based BCI for Automatic Imagined Speech Recognition Using SPWVD
    Kamble, Ashwin
    Ghare, Pradnya H.
    Kumar, Vinay
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72