A focus module-based lightweight end-to-end CNN framework for voiceprint recognition

Cited by: 9
Authors
Velayuthapandian, Karthikeyan [1]
Subramoniam, Suja Priyadharsini [2]
Affiliations
[1] Mepco Schlenk Engn Coll, Dept Elect & Commun Engn, Sivakasi, Tamil Nadu, India
[2] Anna Univ Reg Campus, Dept Elect & Commun Engn, Tirunelveli, Tamil Nadu, India
Keywords
Speaker recognition; Deep neural network; Spectrogram; 1-D CNN; Focus module; Support vector machines; Speaker; System
DOI
10.1007/s11760-023-02500-7
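Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification code
0808; 0809
Abstract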
Speaker identification is the task of determining who is speaking from a sequence of time-series audio data. Most modern experimental approaches rely on convolutional neural networks (CNNs) or other deep neural networks. This work presents a CNN model for speaker identification: a jump-connected one-dimensional convolutional neural network (1-D CNN) with a focus module (FM). The 1-D convolutional layer integrated with the FM extracts speaker characteristics and lessens heterogeneity in the temporal and spatial domains, allowing quicker layer processing. Furthermore, layered CNN hopping interconnections are employed to overcome connectivity glitches, and a solution based on softmax loss and smooth L1-norm combined regulation is presented to increase efficiency. The proposed network model was evaluated on the ELSDSR, TIMIT, NIST, 16,000 PCM, and experimental audio datasets. According to the experimental data, the equal error rate (EER) of the end-to-end CNN for voiceprint identification is 9.02% higher than that of baseline approaches. In experiments, the proposed speaker recognition (SR) model, referred to as the deep FM-1D CNN, achieved a recognition accuracy of 99.21%. Moreover, the observations demonstrate that the proposed network model is more robust than other models.
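The abstract reports results as an equal error rate (EER), the operating point at which the false acceptance rate equals the false rejection rate. As a rough illustration of how that metric can be approximated from verification scores, the sketch below sweeps candidate thresholds over two score lists. The function name `equal_error_rate` and the toy scores are assumptions made for illustration only; they are not taken from the paper, and the authors' evaluation procedure may differ.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER from similarity scores (higher = more similar).

    genuine:  scores for trials where the speaker claim is true
    impostor: scores for trials where the speaker claim is false
    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(genuine) | set(impostor))
    best_gap, best_eer = float("inf"), None
    for t in thresholds:
        # False acceptance: an impostor trial scores at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        # False rejection: a genuine trial scores below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if gap < best_gap:
            # Report the midpoint of FAR and FRR where they are closest.
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


# Toy scores (made up): one genuine trial scores below one impostor trial.
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(genuine_scores, impostor_scores))
```

With a finite set of trials the two error rates rarely cross exactly, so the sketch returns their midpoint at the threshold where they are closest; a lower EER indicates better verification performance.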
Pages: 2817 - 2825
Number of pages: 9
Related papers
50 in total
  • [21] UTTERANCE-LEVEL END-TO-END LANGUAGE IDENTIFICATION USING ATTENTION-BASED CNN-BLSTM
    Cai, Weicheng
    Cai, Danwei
    Huang, Shen
    Li, Ming
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 5991 - 5995
  • [22] END-TO-END LANGUAGE RECOGNITION USING ATTENTION BASED HIERARCHICAL GATED RECURRENT UNIT MODELS
    Padi, Bharat
    Mohan, Anand
    Ganapathy, Sriram
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 5966 - 5970
  • [23] A High Performance FPGA-based Accelerator Design for End-to-End Speaker Recognition System
    Jiao, Mingjun
    Li, Yue
    Dang, Pengbo
    Cao, Wei
    Wang, Lingli
    2019 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (ICFPT 2019), 2019, : 215 - 223
  • [24] End-to-end emotional speech recognition using acoustic model adaptation based on knowledge distillation
    Hong-In Yun
    Jeong-Sik Park
    Multimedia Tools and Applications, 2023, 82 : 22759 - 22776
  • [25] AN END-TO-END MULTITASK LEARNING MODEL TO IMPROVE SPEECH EMOTION RECOGNITION
    Fu, Changzeng
    Liu, Chaoran
    Ishi, Carlos Toshinori
    Ishiguro, Hiroshi
    28TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2020), 2021, : 351 - 355
  • [26] An end-to-end framework for the detection of mathematical expressions in scientific document images
    Phong, Bui Hai
    Hoang, Thang Manh
    Le, Thi-Lan
    EXPERT SYSTEMS, 2022, 39 (01)
  • [27] Protecting the Ownership of Deep Learning Models with An End-to-End Watermarking Framework
    Zhang, Wei
    Cui, Wenxue
    Jiang, Feng
    Yang, Chifu
    Li, Ran
    2021 IEEE 20TH INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2021), 2021, : 76 - 82
  • [28] RefineNet-based End-to-end Speech Enhancement
    Lan T.
    Peng C.
    Li S.
    Qian Y.-X.
    Chen C.
    Liu Q.
    Zidonghua Xuebao/Acta Automatica Sinica, 2022, 48 (02) : 554 - 563
  • [29] Application of End-to-End Perception Framework Based on Boosted DETR in UAV Inspection of Overhead Transmission Lines
    Wang, Jinyu
    Jin, Lijun
    Li, Yingna
    Cao, Pei
    DRONES, 2024, 8 (10)
  • [30] FPJA-Net: A Lightweight End-to-End Network for Sleep Stage Prediction Based on Feature Pyramid and Joint Attention
    Liu, Zhi
    Zhang, Qinhan
    Luo, Sixin
    Qin, Meiqiao
    INTERDISCIPLINARY SCIENCES-COMPUTATIONAL LIFE SCIENCES, 2024, 16 (04) : 769 - 780