Efficient Integrated Features Based on Pre-trained Models for Speaker Verification

Cited: 0
Authors
Li, Yishuang [1 ,2 ]
Guan, Wenhao [3 ]
Huang, Hukai [3 ]
Miao, Shiyu [2 ]
Su, Qi [2 ]
Li, Lin [1 ,2 ]
Hong, Qingyang [3 ]
Affiliations
[1] Xiamen Univ, Inst Artificial Intelligence, Xiamen, Peoples R China
[2] Xiamen Univ, Sch Elect Sci & Engn, Xiamen, Peoples R China
[3] Xiamen Univ, Sch Informat, Xiamen, Peoples R China
Source
INTERSPEECH 2024 | 2024
Funding
National Natural Science Foundation of China;
Keywords
speaker verification; pre-trained models; feature integration; t-SNE; speech;
DOI
10.21437/Interspeech.2024-1889
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Number
081104; 0812; 0835; 1405;
Abstract
Previous work has explored the application of pre-trained models (PTMs) to speaker verification (SV). Most studies directly replaced handcrafted features with the universal representations of PTMs and jointly fine-tuned the PTMs with the downstream SV networks, which discarded the important spectral information contained in the handcrafted features and increased the training cost. In this paper, we propose an efficient feature integration method that uses a Fine-grained Fusion Module to adaptively fuse the multi-layer representations of PTMs. We then integrate the fused representations with handcrafted features to obtain integrated features, which are subsequently fed into the SV network. Experimental results demonstrate that the integrated features effectively enhance the performance of SV systems and yield decent results with no need to fine-tune the PTMs; moreover, full-parameter fine-tuning achieves the best results.
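The abstract describes the method only at a high level. As a concrete illustration, below is a minimal PyTorch sketch of one plausible reading of that pipeline: a learnable softmax weighting over the PTM's per-layer hidden states as the adaptive multi-layer fusion, followed by concatenation with handcrafted FBank features and a linear projection. The module names, the softmax weighting, the choice of FBank, and the concatenate-then-project step are illustrative assumptions, not the paper's actual Fine-grained Fusion Module.

# Minimal sketch of the feature-integration idea in the abstract.
# Assumptions (not from the paper): a WavLM/HuBERT-style PTM exposing
# per-layer hidden states; softmax layer weights as the adaptive fusion;
# channel-wise concatenation with 80-dim FBank as the handcrafted feature.
import torch
import torch.nn as nn

class LayerFusion(nn.Module):
    """Adaptively fuse the multi-layer representations of a (frozen) PTM."""
    def __init__(self, num_layers: int):
        super().__init__()
        # One learnable scalar per PTM layer, normalized with softmax.
        self.layer_weights = nn.Parameter(torch.zeros(num_layers))

    def forward(self, hidden_states: list[torch.Tensor]) -> torch.Tensor:
        # hidden_states: num_layers tensors, each of shape (B, T, D).
        stacked = torch.stack(hidden_states, dim=0)          # (L, B, T, D)
        weights = torch.softmax(self.layer_weights, dim=0)   # (L,)
        return (weights.view(-1, 1, 1, 1) * stacked).sum(0)  # (B, T, D)

class IntegratedFeatures(nn.Module):
    """Concatenate fused PTM representations with handcrafted features."""
    def __init__(self, num_layers: int, ptm_dim: int, fbank_dim: int, out_dim: int):
        super().__init__()
        self.fusion = LayerFusion(num_layers)
        # Project the concatenation to the input size of the SV network.
        self.proj = nn.Linear(ptm_dim + fbank_dim, out_dim)

    def forward(self, hidden_states, fbank):
        fused = self.fusion(hidden_states)          # (B, T, D_ptm)
        # Assumes PTM frames and FBank frames are time-aligned to length T.
        integrated = torch.cat([fused, fbank], dim=-1)
        return self.proj(integrated)                # input to the SV network

# Toy usage: 13 hidden states of a base-size PTM (768-dim), 80-dim FBank.
hidden = [torch.randn(2, 100, 768) for _ in range(13)]
fbank = torch.randn(2, 100, 80)
model = IntegratedFeatures(num_layers=13, ptm_dim=768, fbank_dim=80, out_dim=80)
out = model(hidden, fbank)  # shape (2, 100, 80), e.g. fed to an ECAPA-TDNN

With the PTM kept frozen, only the layer weights and the projection are trained, which matches the abstract's observation that decent results are obtained with no need to fine-tune the PTMs.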
Pages: 2140-2144
Page count: 5
Related Papers
50 records in total (first 10 shown)
  • [1] Semi-supervised speaker verification system based on pre-trained models
    Li, Yishuang
    Chen, Zhicong
    Miao, Shiyu
    Su, Qi
    Li, Lin
    Hong, Qingyang
    Qinghua Daxue Xuebao/Journal of Tsinghua University, 2024, 64(11): 1936-1943
  • [2] PRISM: Pre-trained Indeterminate Speaker Representation Model for Speaker Diarization and Speaker Verification
    Zheng, Siqi
    Suo, Hongbin
    Chen, Qian
    INTERSPEECH 2022, 2022: 1431-1435
  • [3] An iVector Extractor Using Pre-trained Neural Networks for Speaker Verification
    Zhang, Shanshan
    Zheng, Rong
    Xu, Bo
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014: 73-77
  • [4] Text clustering based on pre-trained models and autoencoders
    Xu, Qiang
    Gu, Hao
    Ji, ShengWei
    FRONTIERS IN COMPUTATIONAL NEUROSCIENCE, 2024, 17
  • [5] Efficient Key-Based Adversarial Defense for ImageNet by Using Pre-Trained Models
    Maungmaung, Aprilpyone
    Echizen, Isao
    Kiya, Hitoshi
    IEEE OPEN JOURNAL OF SIGNAL PROCESSING, 2024, 5: 902-913
  • [6] Pre-Trained Language Models and Their Applications
    Wang, Haifeng
    Li, Jiwei
    Wu, Hua
    Hovy, Eduard
    Sun, Yu
    ENGINEERING, 2023, 25: 51-65
  • [7] Pre-trained models: Past, present and future
    Han, Xu
    Zhang, Zhengyan
    Ding, Ning
    Gu, Yuxian
    Liu, Xiao
    Huo, Yuqi
    Qiu, Jiezhong
    Yao, Yuan
    Zhang, Ao
    Zhang, Liang
    Han, Wentao
    Huang, Minlie
    Jin, Qin
    Lan, Yanyan
    Liu, Yang
    Liu, Zhiyuan
    Lu, Zhiwu
    Qiu, Xipeng
    Song, Ruihua
    Tang, Jie
    Wen, Ji-Rong
    Yuan, Jinhui
    Zhao, Wayne Xin
    Zhu, Jun
    AI OPEN, 2021, 2: 225-250
  • [8] Natural Attack for Pre-trained Models of Code
    Yang, Zhou
    Shi, Jieke
    He, Junda
    Lo, David
    2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022), 2022: 1482-1493
  • [9] HinPLMs: Pre-trained Language Models for Hindi
    Huang, Xixuan
    Lin, Nankai
    Li, Kexin
    Wang, Lianxi
    Gan, Suifu
    2021 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2021: 241-246
  • [10] SASV Based on Pre-trained ASV System and Integrated Scoring Module
    Zhang, Yuxiang
    Li, Zhuo
    Wang, Wenchao
    Zhang, Pengyuan
    INTERSPEECH 2022, 2022: 4376-4380