Text-independent speaker recognition using LSTM-RNN and speech enhancement

Cited by: 0
Authors
Samia Abd El-Moneim
M. A. Nassar
Moawad I. Dessouky
Nabil A. Ismail
Adel S. El-Fishawy
Fathi E. Abd El-Samie
Affiliations
[1] Tanta High Institute of Engineering and Technology, Department of Electrical Communications
[2] Faculty of Electronic Engineering, Department of Electronics and Electrical Communications Engineering
[3] Menoufia University, Department of Computer Science and Engineering
[4] Faculty of Electronic Engineering, Department of Information Technology, College of Computer and Information Sciences
[5] Menoufia University
[6] Princess Nourah Bint Abdulrahman University
Source
Multimedia Tools and Applications | 2020 / Vol. 79 (33-34)
Keywords
Speaker recognition; MFCCs; Spectrum; Log-spectrum; LSTM-RNN; Reverberation; Speech enhancement
DOI
Not available
Abstract
Advances in speaker recognition have led to the inclusion of speaker recognition modules in several commercial products. Most published algorithms focus on text-dependent speaker recognition. Text-independent speaker recognition is more advantageous, because the client can talk freely to the system. This paper considers text-independent speaker recognition in the presence of degradations such as noise and reverberation. Three feature types are extracted from the speech signals: Mel-Frequency Cepstral Coefficients (MFCCs), the spectrum, and the log-spectrum. These features are fed to a Long Short-Term Memory Recurrent Neural Network (LSTM-RNN), which acts as the classifier that completes the speaker recognition task. The network learns to recognize speakers efficiently in a text-independent manner when the recording conditions are the same. The recognition rate reaches 95.33% with MFCCs and rises to 98.7% with the spectrum or log-spectrum. However, the system has difficulty recognizing speakers across different recording environments. Speech enhancement techniques, such as spectral subtraction and wavelet denoising, are therefore used, and they improve the recognition performance to some extent. The proposed approach outperforms the algorithm of R. Togneri and D. Pullella (2011).
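As an illustration of two steps named in the abstract, the following is a minimal NumPy sketch of spectrum and log-spectrum feature extraction followed by basic magnitude spectral subtraction. It is not the authors' implementation. The frame length, hop size, noise-estimation window, spectral floor, and the synthetic test signal are all assumptions chosen for the example, since the abstract does not state them. The helper names (`frame_signal`, `spectral_features`, `spectral_subtraction`) are hypothetical.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping Hamming-windowed frames.

    frame_len and hop are assumed values (25 ms / 10 ms at 16 kHz).
    """
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)

def spectral_features(x, frame_len=400, hop=160, eps=1e-10):
    """Return per-frame magnitude spectrum and log-spectrum."""
    frames = frame_signal(x, frame_len, hop)
    mag = np.abs(np.fft.rfft(frames, axis=1))
    return mag, np.log(mag + eps)  # eps avoids log(0)

def spectral_subtraction(mag, noise_frames=5, floor=0.01):
    """Basic magnitude spectral subtraction.

    Assumes the first `noise_frames` frames contain noise only; their mean
    magnitude is subtracted from every frame, with a spectral floor so that
    no bin becomes negative.
    """
    noise = mag[:noise_frames].mean(axis=0)
    return np.maximum(mag - noise, floor * mag)

# Synthetic example: white noise, with a 440 Hz tone starting at 0.25 s.
rng = np.random.default_rng(0)
sr = 16000
t = np.arange(sr) / sr
x = 0.1 * rng.standard_normal(sr)
x[sr // 4:] += np.sin(2 * np.pi * 440 * t[sr // 4:])

mag, log_spec = spectral_features(x)
enhanced = spectral_subtraction(mag)
print(mag.shape, log_spec.shape, enhanced.shape)
```

In a pipeline like the one the abstract describes, each row of `mag` or `log_spec` would be one time step of the sequence passed to the recurrent classifier. The enhancement step would be applied before feature extraction for recordings made in a different environment.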
Pages: 24013 - 24028
Page count: 15