Text-independent speaker recognition using LSTM-RNN and speech enhancement

Cited by: 0

Authors:

Samia Abd El-Moneim

M. A. Nassar

Moawad I. Dessouky

Nabil A. Ismail

Adel S. El-Fishawy

Fathi E. Abd El-Samie

Affiliations:

[1] Tanta High Institute of Engineering and Technology, Department of Electrical Communications

[2] Faculty of Electronic Engineering, Department of Electronics and Electrical Communications Engineering

[3] Menoufia University, Department of Computer Science and Engineering

[4] Faculty of Electronic Engineering, Department of Information Technology, College of Computer and Information Sciences

[5] Menoufia University

[6] Princess Nourah Bint Abdulrahman University

Source:

Multimedia Tools and Applications | 2020 / Vol. 79

Keywords:

Speaker recognition; MFCCs; Spectrum; Log-spectrum; LSTM-RNN; Reverberation; Speech enhancement

DOI:

Not available

Abstract:

The speaker recognition revolution has led to the inclusion of speaker recognition modules in several commercial products. Most published algorithms for speaker recognition focus on text-dependent speaker recognition. In contrast, text-independent speaker recognition is more advantageous, as the client can talk freely to the system. In this paper, text-independent speaker recognition is considered in the presence of degradation effects such as noise and reverberation. Mel-Frequency Cepstral Coefficients (MFCCs), the spectrum, and the log-spectrum are used for feature extraction from the speech signals. These features are processed with a Long Short-Term Memory Recurrent Neural Network (LSTM-RNN) as a classification tool to complete the speaker recognition task. The network learns to recognize the speakers efficiently in a text-independent manner when the recording circumstances are the same. The recognition rate reaches 95.33% using MFCCs, and it increases to 98.7% when using the spectrum or log-spectrum. However, the system has difficulty recognizing speakers across different recording environments. Hence, different speech enhancement techniques, such as spectral subtraction and wavelet denoising, are used to improve the recognition performance to some extent. The proposed approach shows superiority when compared to the algorithm of R. Togneri and D. Pullella (2011).
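
As a rough illustration of two ideas named in the abstract, the sketch below computes per-frame spectrum and log-spectrum features and applies a basic magnitude spectral subtraction. It is not the authors' implementation: the frame length, hop size, Hann window, noise-only leading frames, spectral floor, and all function names are assumptions chosen for the example, and the LSTM-RNN classifier is not shown.

```python
import numpy as np

def frame_signal(x, frame_len=256, hop=128):
    """Split a 1-D signal into overlapping Hann-windowed frames (assumed sizes)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hanning(frame_len)

def spectrum_features(x, frame_len=256, hop=128, eps=1e-10):
    """Return (magnitude spectrum, log-spectrum), one row per frame."""
    mag = np.abs(np.fft.rfft(frame_signal(x, frame_len, hop), axis=1))
    return mag, np.log(mag + eps)  # eps avoids log(0)

def spectral_subtraction(mag, noise_frames=5, floor=0.01):
    """Subtract an average noise magnitude from every frame.

    Assumption: the first `noise_frames` frames contain noise only.
    A spectral floor keeps the result from going negative.
    """
    noise = mag[:noise_frames].mean(axis=0)
    return np.maximum(mag - noise, floor * mag)

# Synthetic example: a 440 Hz tone with leading silence plus white noise.
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t)
clean[:1024] = 0.0  # leading silence, so the first frames are noise-only
noisy = clean + 0.1 * rng.standard_normal(fs)

mag, log_mag = spectrum_features(noisy)
enhanced = spectral_subtraction(mag)
```

Each row of `mag` or `log_mag` is one time step, so the resulting matrix could be fed to a sequence model such as an LSTM as a sequence of feature vectors; `enhanced` could replace `mag` when the recording is noisy.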

Pages: 24013-24028

Page count: 15

50 items in total

[1] Text-independent speaker recognition using LSTM-RNN and speech enhancement
Abd El-Moneim, Samia
Nassar, M. A.
Dessouky, Moawad I.
Ismail, Nabil A.
El-Fishawy, Adel S.
Abd El-Samie, Fathi E.
MULTIMEDIA TOOLS AND APPLICATIONS, 2020, 79 (33-34) : 24013 - 24028
[2] TEXT-INDEPENDENT SPEAKER RECOGNITION
ATAL, BS
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, 1972, 52 (01) : 181 - &
[3] TEXT-INDEPENDENT SPEAKER RECOGNITION USING NEURAL NETWORKS
HATTORI, H
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 1993, E76D (03) : 345 - 351
[4] Text-independent speaker recognition using graph matching
Hautamaki, Ville
Kinnunen, Tomi
Franti, Pasi
PATTERN RECOGNITION LETTERS, 2008, 29 (09) : 1427 - 1432
[5] A novel speech feature fusion algorithm for text-independent speaker recognition
Ma, Biao
Xu, Chengben
Zhang, Ye
MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (24) : 64139 - 64156
[6] Bengali Speech Recognition: A Double Layered LSTM-RNN Approach
Nahid, Md Mahadi Hasan
Purkaystha, Bishwajit
Islam, Md Saiful
2017 20TH INTERNATIONAL CONFERENCE OF COMPUTER AND INFORMATION TECHNOLOGY (ICCIT), 2017
[7] Text-dependent and text-independent speaker recognition of reverberant speech based on CNN
El-Moneim, Samia Abd
Sedik, Ahmed
Nassar, M. A.
El-Fishawy, Adel S.
Sharshar, A. M.
Hassan, Shaimaa E. A.
Mahmoud, Adel Zaghloul
Dessouky, Moawd I.
El-Banby, Ghada M.
El-Samie, Fathi E. Abd
El-Rabaie, El-Sayed M.
Neyazi, Badawi
Seddeq, H. S.
Ismail, Nabil A.
Khalaf, Ashraf A. M.
Elabyad, G. S. M.
INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2021, 24 (04) : 993 - 1006
[8] Effect of Spoken Text on Text-independent Speaker Recognition
Alsulaiman, Mansour
PROCEEDINGS FIFTH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS, MODELLING AND SIMULATION, 2014, : 279 - 284
[9] Text-dependent and text-independent speaker recognition of reverberant speech based on CNN
Samia Abd El-Moneim
Ahmed Sedik
M. A. Nassar
Adel S. El-Fishawy
A. M. Sharshar
Shaimaa E. A. Hassan
Adel Zaghloul Mahmoud
Moawd I. Dessouky
Ghada M. El-Banby
Fathi E. Abd El-Samie
El-Sayed M. El-Rabaie
Badawi Neyazi
H. S. Seddeq
Nabil A. Ismail
Ashraf A. M. Khalaf
G. S. M. Elabyad
International Journal of Speech Technology, 2021, 24 : 993 - 1006
[10] Text-independent speaker recognition using support vector machine
Hou, FL
Wang, BX
2001 INTERNATIONAL CONFERENCES ON INFO-TECH AND INFO-NET PROCEEDINGS, CONFERENCE A-G: INFO-TECH & INFO-NET: A KEY TO BETTER LIFE, 2001, : C402 - C407
