Speaker Recognition with Deep Learning Approaches: A Review

Cited by: 0
Authors
Alenizi, Abdulrahman S. [1]
Al-Karawi, Khamis A. [2]
Affiliations
[1] PAAET, Shuwaikh Ind, Kuwait
[2] Diyala Univ, Baqubah, Diyala, Iraq
Source
PROCEEDINGS OF NINTH INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY, VOL 5, ICICT 2024 | 2024 / Vol. 1000
Keywords
Deep learning; Text independence; Feature extraction; Statistical models; Discriminative models; Speaker identification; Speaker verification; Machines; Noise
DOI
10.1007/978-981-97-3289-0_39
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article gives an overview of deep learning methods for speaker identification and verification. Speaker recognition is an everyday application of speech technology. Many research initiatives have been carried out in the past few years, but little progress has been achieved. However, just as deep learning techniques are replacing previous state-of-the-art approaches in speech recognition, they are also advancing in most machine learning fields. Deep learning appears to have become the most advanced technique for speaker verification and identification. Most novel efforts start from the common x-vectors, in addition to i-vectors. The increasing volume of collected data makes the area where deep learning is most effective more accessible.
Pages: 481-499
Page count: 19
Related papers
50 items in total
  • [11] A review on speaker recognition: Technology and challenges
    Hanifa, Rafizah Mohd
    Isa, Khalid
    Mohamad, Shamsul
    COMPUTERS & ELECTRICAL ENGINEERING, 2021, 90
  • [12] Deep Learning in Barcode Recognition: A Systematic Literature Review
    Wudhikarn, Ratapol
    Charoenkwan, Phasit
    Malang, Kanokwan
    IEEE ACCESS, 2022, 10 : 8049 - 8072
  • [13] Multi-Noise Representation Learning for Robust Speaker Recognition
    Cho, Sunyoung
    Wee, Kyungchul
    IEEE SIGNAL PROCESSING LETTERS, 2025, 32 : 681 - 685
  • [14] Music Emotion Recognition Based on Deep Learning: A Review
    Jiang, Xingguo
    Zhang, Yuchao
    Lin, Guojun
    Yu, Ling
    IEEE ACCESS, 2024, 12 : 157716 - 157745
  • [15] Speaker recognition based on pre-processing approaches
    Samia Abd El-Moneim
    El-Sayed M. EL-Rabaie
    M. A. Nassar
    Moawad I. Dessouky
    Nabil A. Ismail
    Adel S. El-Fishawy
    Fathi E. Abd El-Samie
    International Journal of Speech Technology, 2020, 23 : 435 - 442
  • [16] Speaker recognition based on pre-processing approaches
    Abd El-Moneim, Samia
    El-Rabaie, El-Sayed Mahmoud
    Nassar, M. A.
    Dessouky, Moawad, I
    Ismail, Nabil A.
    El-Fishawy, Adel S.
    Abd El-Samie, Fathi E.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2020, 23 (02) : 435 - 442
  • [17] Impact of Deep Learning Approaches on Facial Expression Recognition in Healthcare Industries
    Bisogni, Carmen
    Castiglione, Aniello
    Hossain, Sanoar
    Narducci, Fabio
    Umer, Saiyed
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2022, 18 (08) : 5619 - 5627
  • [18] Analyzing Noise Robustness of Cochleogram and Mel Spectrogram Features in Deep Learning Based Speaker Recognition
    Lambamo, Wondimu
    Srinivasagan, Ramasamy
    Jifara, Worku
    APPLIED SCIENCES-BASEL, 2023, 13 (01):
  • [19] Latent discriminative representation learning for speaker recognition
    Huang, Duolin
    Mao, Qirong
    Ma, Zhongchen
    Zheng, Zhishen
    Routryar, Sidheswar
    Ocquaye, Elias-Nii-Noi
    FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2021, 22 (05) : 697 - 708
  • [20] Learning statistically efficient features for speaker recognition
    Jang, GJ
    Lee, TW
    Oh, YH
    NEUROCOMPUTING, 2002, 49 : 329 - 348