English speech recognition based on deep learning with multiple features

Cited by: 2
Author
Zhaojuan Song
Affiliation
[1] School of Translation Studies of Qufu Normal University,
Source
Computing | 2020 / Vol. 102
Keywords
Deep neural network; Fusion; Speech recognition; Multiple features; 68T10; 68T35; 68T50;
DOI
Not available
Abstract
English is one of the most widely used languages. As the world becomes more connected, smart-home devices, in-vehicle voice systems, and voice recognition software that use English as the recognition language have entered everyday use and have won favor with the majority of users through their practical accuracy. Deep learning, with its hierarchical feature learning and data modeling capabilities, has outperformed shallow learning techniques on many tasks. This paper therefore takes English speech as its research object and proposes a deep learning speech recognition algorithm that combines speech features and speech attributes. First, a deep neural network trained with supervised learning is used to extract high-level features of the speech: the output of a fixed hidden layer is selected as the new speech feature for the newly generated network, and a GMM–HMM acoustic model is trained on these new features. Second, deep-neural-network-based speech attribute extractors are trained for multiple speech attributes, and the extracted attributes are classified into phonemes by a deep neural network. Finally, the speech features and speech attribute features are merged into the same CNN framework by a neural network based on a linear feature fusion algorithm. The experimental results show that the proposed multi-feature algorithm can directly and effectively combine the two methods by fusing the speech features and the speech attributes of the speaker at the input layer of the deep neural network, and that it significantly improves the performance of the English speech recognition system.
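The abstract does not spell out the linear feature fusion step, so the following is only a minimal sketch under stated assumptions: fusion is taken to mean scaling each per-frame feature stream by a scalar weight and concatenating the two streams before they reach the input layer. The function names, the weights `w_speech` and `w_attr`, and the toy dimensions are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Frame = List[float]


def fuse_frame(speech_feat: Sequence[float],
               attr_feat: Sequence[float],
               w_speech: float = 1.0,
               w_attr: float = 1.0) -> Frame:
    """Scale each stream by its (assumed) scalar weight, then concatenate."""
    scaled_speech = [w_speech * x for x in speech_feat]
    scaled_attr = [w_attr * a for a in attr_feat]
    return scaled_speech + scaled_attr


def fuse_utterance(speech_frames: Sequence[Sequence[float]],
                   attr_frames: Sequence[Sequence[float]],
                   w_speech: float = 1.0,
                   w_attr: float = 1.0) -> List[Frame]:
    """Fuse two frame-aligned feature streams into one input sequence."""
    if len(speech_frames) != len(attr_frames):
        raise ValueError("both streams must have the same number of frames")
    return [fuse_frame(s, a, w_speech, w_attr)
            for s, a in zip(speech_frames, attr_frames)]


# Toy data: 2 frames, 3-dim hidden-layer features, 2-dim attribute scores.
speech = [[0.5, -1.0, 2.0], [1.0, 0.0, -0.5]]
attrs = [[0.9, 0.1], [0.2, 0.8]]
fused = fuse_utterance(speech, attrs, w_speech=1.0, w_attr=0.5)
```

Each fused frame here has 3 + 2 = 5 dimensions, and that would be the input size of the downstream network. Learned weights or a projection layer could replace the scalar weights; the abstract does not say which variant the paper uses.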
Pages: 663-682
Page count: 19