Efficient speaker identification using spectral entropy

Cited by: 7
Authors
Luque-Suarez, Fernando [1]
Camarena-Ibarrola, Antonio [2]
Chavez, Edgar [1]
Affiliations
[1] CICESE, Ensenada, Baja California, Mexico
[2] Univ Michoacana, Morelia, Michoacan, Mexico
Keywords
Speaker recognition; Speaker identification; Entropygrams; RECOGNITION;
DOI
10.1007/s11042-018-7035-9
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
In voice recognition, the two main problems are speech recognition (what was said) and speaker recognition (who was speaking). The usual method for speaker recognition is to postulate a model in which the speaker identity corresponds to the parameters of the model, whose estimation can be time-consuming when the number of candidate speakers is large. In this paper, we model the speaker as a high-dimensional point cloud of entropy-based features extracted from the speech signal. The method allows indexing, and hence it can manage large databases. We experimentally assessed the quality of the identification with a publicly available database formed by extracting audio from a collection of YouTube videos of 1,000 different speakers. With 20-second audio excerpts, we were able to identify a speaker with 97% accuracy when the recording environment is not controlled, and with 99% accuracy for controlled recording environments.
Pages: 16803 - 16815
Number of pages: 13
Related papers
50 in total
  • [21] NMF Based System for Speaker Identification
    Costantini, Giovanni
    Cesarini, Valerio
    Paolizzo, Fabio
    2021 IEEE INTERNATIONAL WORKSHOP ON METROLOGY FOR INDUSTRY 4.0 & IOT (IEEE METROIND4.0 & IOT), 2021, : 620 - 624
  • [22] A robust DNN model for text-independent speaker identification using non-speaker embeddings in diverse data conditions
    Shome, Nirupam
    Saritha, Banala
    Kashyap, Richik
    Laskar, Rabul Hussain
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (26) : 18933 - 18947
  • [23] Super-Dirichlet Mixture Models using Differential Line Spectral Frequencies for Text-Independent Speaker Identification
    Ma, Zhanyu
    Leijon, Arne
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 2360 - +
  • [24] Spatial and Spectral Fingerprint in The Brain: Speaker Identification from Single Trial MEG Signals
    Dash, Debadatta
    Ferrari, Paul
    Wang, Jun
    INTERSPEECH 2019, 2019, : 1203 - 1207
  • [25] Efficient cancelable speaker identification system based on a hybrid structure of DWT and SVD
    Abdelwahab, Khaled M.
    Abd El-atty, Saied
    Brisha, Ayman M.
    Abd El-Samie, Fathi E.
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2022, 25 (01) : 279 - 288
  • [26] COMPUTATIONALLY EFFICIENT SPEAKER IDENTIFICATION USING FAST-MLLR BASED ANCHOR MODELING
    Sarkar, A. K.
    Umesh, S.
    Bonastre, J. F.
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 4357 - 4360
  • [27] Spectral Restoration Based Speech Enhancement for Robust Speaker Identification
    Saleem, Nasir
    Tareen, Tayyaba Gul
    INTERNATIONAL JOURNAL OF INTERACTIVE MULTIMEDIA AND ARTIFICIAL INTELLIGENCE, 2018, 5 (01) : 34 - 39
  • [28] A Deep Neural Network Model for Speaker Identification
    Ye, Feng
    Yang, Jun
    APPLIED SCIENCES-BASEL, 2021, 11 (08):
  • [29] EMARATI SPEAKER IDENTIFICATION
    Shahin, Ismail
    Ba-Hutair, Mohammed Nasser
    2014 12TH INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING (ICSP), 2014, : 488 - 493
  • [30] An Approach to Speaker Identification
    Hollien, Harry
    JOURNAL OF FORENSIC SCIENCES, 2016, 61 (02) : 334 - 344