Automatic speaker recognition from speech signal using bidirectional long-short-term memory recurrent neural network

Cited by: 9
Authors
Devi, Kharibam Jilenkumari [1]
Thongam, Khelchandra [2]
Affiliations
[1] Natl Inst Technol Manipur, Dept Elect & Commun Engn, Imphal 795004, Manipur, India
[2] Natl Inst Technol Manipur, Dept Comp Sci & Engn, Imphal, Manipur, India
Keywords
Mel-frequency cepstral coefficient; probabilistic principal component analysis; recurrent neural network-bidirectional long short term memory; Wiener filter algorithm; IDENTIFICATION; VERIFICATION
DOI
10.1111/coin.12278
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Speaker recognition remains a major research challenge across languages. When an automatic speaker recognition system is trained on normal speech, shouted speech creates a mismatch between enrollment and test conditions, because shouting requires extreme vocal effort, and this mismatch reduces identification performance. The major problems are the long time needed to classify the data, optimizing accuracy, and achieving a low root-mean-square error. The objective of this work is to develop an efficient speaker recognition system. An improved Wiener filter algorithm is applied for better noise reduction. Mel-frequency cepstral coefficient (MFCC) feature extraction is then applied to the denoised signals to obtain the essential feature vectors. The dimensionality of the extracted features is reduced using probabilistic principal component analysis (PPCA), and the reduced features form the input samples. Finally, a bidirectional long short-term memory recurrent neural network (RNN-BiLSTM) performs the classification to improve prediction accuracy. To assess its effectiveness, the proposed method is compared with existing methods in terms of accuracy, sensitivity, and error rate. The proposed method achieves an accuracy of 95.77%.
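The final classification stage named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the weights are random and untrained, the random array stands in for dimension-reduced MFCC features, and all sizes (50 frames, 8 features, 16 hidden units, 4 speakers) and function names are assumptions chosen only for illustration. It shows the bidirectional idea itself: one LSTM reads the frame sequence forward, a second reads it backward, and their final hidden states are concatenated before a softmax over speakers.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_lstm(input_dim, hidden_dim, rng):
    """Random, untrained parameters for one LSTM direction (illustrative only)."""
    return {
        "W": rng.normal(0.0, 0.1, (4 * hidden_dim, input_dim)),   # input weights
        "U": rng.normal(0.0, 0.1, (4 * hidden_dim, hidden_dim)),  # recurrent weights
        "b": np.zeros(4 * hidden_dim),                            # gate biases
    }


def lstm_last_state(x, params):
    """Run an LSTM over x with shape (frames, input_dim); return the final hidden state."""
    hidden_dim = params["U"].shape[1]
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    for x_t in x:
        z = params["W"] @ x_t + params["U"] @ h + params["b"]
        i, f, o, g = np.split(z, 4)          # input, forget, output gates and candidate
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g                    # update the cell state
        h = o * np.tanh(c)                   # update the hidden state
    return h


def bilstm_speaker_probs(x, fwd, bwd, W_out, b_out):
    """Concatenate forward and backward final states, then apply a softmax layer."""
    h_fwd = lstm_last_state(x, fwd)          # reads frames first to last
    h_bwd = lstm_last_state(x[::-1], bwd)    # reads frames last to first
    logits = W_out @ np.concatenate([h_fwd, h_bwd]) + b_out
    exp = np.exp(logits - logits.max())      # numerically stable softmax
    return exp / exp.sum()


rng = np.random.default_rng(0)
n_frames, n_features, n_hidden, n_speakers = 50, 8, 16, 4  # assumed sizes

# Placeholder for a dimension-reduced MFCC sequence of one utterance.
features = rng.normal(size=(n_frames, n_features))

fwd = init_lstm(n_features, n_hidden, rng)
bwd = init_lstm(n_features, n_hidden, rng)
W_out = rng.normal(0.0, 0.1, (n_speakers, 2 * n_hidden))
b_out = np.zeros(n_speakers)

probs = bilstm_speaker_probs(features, fwd, bwd, W_out, b_out)
print(probs.shape)
```

In practice the parameters would be learned from labeled utterances with a deep learning framework; the sketch only traces the forward computation that turns a feature sequence into per-speaker probabilities.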
Pages: 170-193
Page count: 24