Efficient Integrated Features Based on Pre-trained Models for Speaker Verification

Cited by: 0
Authors
Li, Yishuang [1 ,2 ]
Guan, Wenhao [3 ]
Huang, Hukai [3 ]
Miao, Shiyu [2 ]
Su, Qi [2 ]
Li, Lin [1 ,2 ]
Hong, Qingyang [3 ]
Affiliations
[1] Xiamen Univ, Inst Artificial Intelligence, Xiamen, Peoples R China
[2] Xiamen Univ, Sch Elect Sci & Engn, Xiamen, Peoples R China
[3] Xiamen Univ, Sch Informat, Xiamen, Peoples R China
Source
INTERSPEECH 2024 | 2024
Funding
National Natural Science Foundation of China;
Keywords
speaker verification; pre-trained models; feature integration; t-SNE; SPEECH;
DOI
10.21437/Interspeech.2024-1889
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Previous work has explored the application of pre-trained models (PTMs) to speaker verification (SV). Most researchers directly replaced handcrafted features with the universal representations of PTMs and jointly fine-tuned the PTMs with the downstream SV networks, which discarded important spectral information contained in handcrafted features and also increased the training cost. In this paper, we proposed an efficient feature integration method that utilized a Fine-grained Fusion Module to adaptively fuse the multi-layer representations of PTMs. We then integrated the fused representations with handcrafted features to obtain the integrated features, which were subsequently fed into the SV network. The experimental results demonstrated that the integrated features effectively enhanced the performance of the SV systems and yielded decent results without fine-tuning the PTMs. Moreover, full-parameter fine-tuning produced the best results overall.
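The abstract does not spell out the internals of the Fine-grained Fusion Module, but a common way to fuse multi-layer PTM representations adaptively is a learnable softmax-weighted sum over layers, after which the fused representation can be concatenated with handcrafted features (e.g. filterbanks) along the feature axis. The sketch below illustrates that general pattern in NumPy; the function names, shapes, and the uniform initial weights are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_layers(layer_reps, layer_scores):
    """Adaptive layer fusion: softmax-weighted sum over PTM layers.
    layer_reps: (num_layers, T, D) hidden states from the PTM;
    layer_scores: (num_layers,) raw (learnable) scores."""
    w = softmax(layer_scores)
    return np.tensordot(w, layer_reps, axes=1)  # -> (T, D)

def integrate(fused, handcrafted):
    """Concatenate the fused PTM representation with frame-aligned
    handcrafted features (e.g. FBanks) along the feature axis."""
    return np.concatenate([fused, handcrafted], axis=-1)

# Toy shapes: 4 layers, 10 frames, 8-dim PTM states, 6-dim FBanks.
L, T, D, F = 4, 10, 8, 6
rng = np.random.default_rng(0)
reps = rng.standard_normal((L, T, D))
fbank = rng.standard_normal((T, F))

fused = fuse_layers(reps, np.zeros(L))  # zero scores = uniform weights
feats = integrate(fused, fbank)
print(feats.shape)  # (10, 14)
```

In a trainable system the layer scores would be parameters updated with the SV network, so the model learns which PTM layers carry speaker-discriminative information; here they are fixed for illustration.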
Pages: 2140 - 2144
Number of pages: 5