Efficient Integrated Features Based on Pre-trained Models for Speaker Verification

Cited: 0
Authors
Li, Yishuang [1 ,2 ]
Guan, Wenhao [3 ]
Huang, Hukai [3 ]
Miao, Shiyu [2 ]
Su, Qi [2 ]
Li, Lin [1 ,2 ]
Hong, Qingyang [3 ]
Affiliations
[1] Xiamen Univ, Inst Artificial Intelligence, Xiamen, Peoples R China
[2] Xiamen Univ, Sch Elect Sci & Engn, Xiamen, Peoples R China
[3] Xiamen Univ, Sch Informat, Xiamen, Peoples R China
Source
INTERSPEECH 2024 | 2024
Funding
National Natural Science Foundation of China;
Keywords
speaker verification; pre-trained models; feature integration; t-SNE; speech;
DOI
10.21437/Interspeech.2024-1889
CLC Number
TP18 [Theory of Artificial Intelligence];
Subject Classification Number
081104; 0812; 0835; 1405;
Abstract
Previous work has explored the application of pre-trained models (PTMs) to speaker verification (SV). Most studies directly replaced handcrafted features with the universal representations of PTMs and jointly fine-tuned the PTMs with the downstream SV networks, which discarded the important spectral information contained in the handcrafted features and increased the training cost. In this paper, we propose an efficient feature integration method that uses a Fine-grained Fusion Module to adaptively fuse the multi-layer representations of PTMs. We then integrate the fused representations with handcrafted features to obtain integrated features, which are subsequently fed into the SV network. Experimental results demonstrate that the integrated features effectively enhance the performance of SV systems and yield decent results with no need to fine-tune the PTMs; moreover, full-parameter fine-tuning achieves the best results.
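The abstract describes the method only at a high level. As a concrete illustration, below is a minimal PyTorch sketch of one plausible reading of that pipeline: a learnable softmax weighting over the PTM's per-layer hidden states as the adaptive multi-layer fusion, followed by concatenation with handcrafted FBank features and a linear projection. The module names, the softmax weighting, the choice of FBank, and the concatenate-then-project step are illustrative assumptions, not the paper's actual Fine-grained Fusion Module.

# Minimal sketch of the feature-integration idea in the abstract.
# Assumptions (not from the paper): a WavLM/HuBERT-style PTM exposing
# per-layer hidden states; softmax layer weights as the adaptive fusion;
# channel-wise concatenation with 80-dim FBank as the handcrafted feature.
import torch
import torch.nn as nn

class LayerFusion(nn.Module):
    """Adaptively fuse the multi-layer representations of a (frozen) PTM."""
    def __init__(self, num_layers: int):
        super().__init__()
        # One learnable scalar per PTM layer, normalized with softmax.
        self.layer_weights = nn.Parameter(torch.zeros(num_layers))

    def forward(self, hidden_states: list[torch.Tensor]) -> torch.Tensor:
        # hidden_states: num_layers tensors, each of shape (B, T, D).
        stacked = torch.stack(hidden_states, dim=0)          # (L, B, T, D)
        weights = torch.softmax(self.layer_weights, dim=0)   # (L,)
        return (weights.view(-1, 1, 1, 1) * stacked).sum(0)  # (B, T, D)

class IntegratedFeatures(nn.Module):
    """Concatenate fused PTM representations with handcrafted features."""
    def __init__(self, num_layers: int, ptm_dim: int, fbank_dim: int, out_dim: int):
        super().__init__()
        self.fusion = LayerFusion(num_layers)
        # Project the concatenation to the input size of the SV network.
        self.proj = nn.Linear(ptm_dim + fbank_dim, out_dim)

    def forward(self, hidden_states, fbank):
        fused = self.fusion(hidden_states)          # (B, T, D_ptm)
        # Assumes PTM frames and FBank frames are time-aligned to length T.
        integrated = torch.cat([fused, fbank], dim=-1)
        return self.proj(integrated)                # input to the SV network

# Toy usage: 13 hidden states of a base-size PTM (768-dim), 80-dim FBank.
hidden = [torch.randn(2, 100, 768) for _ in range(13)]
fbank = torch.randn(2, 100, 80)
model = IntegratedFeatures(num_layers=13, ptm_dim=768, fbank_dim=80, out_dim=80)
out = model(hidden, fbank)  # shape (2, 100, 80), e.g. fed to an ECAPA-TDNN

With the PTM kept frozen, only the layer weights and the projection are trained, which matches the abstract's observation that decent results are obtained with no need to fine-tune the PTMs.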
Pages: 2140-2144
Page count: 5
Related Papers
50 records in total (first 10 shown)
  • [1] Semi-supervised speaker verification system based on pre-trained models
    Li, Yishuang
    Chen, Zhicong
    Miao, Shiyu
    Su, Qi
    Li, Lin
    Hong, Qingyang
    Qinghua Daxue Xuebao/Journal of Tsinghua University, 2024, 64(11): 1936-1943
  • [2] PRISM: Pre-trained Indeterminate Speaker Representation Model for Speaker Diarization and Speaker Verification
    Zheng, Siqi
    Suo, Hongbin
    Chen, Qian
    INTERSPEECH 2022, 2022: 1431-1435
  • [3] An iVector Extractor Using Pre-trained Neural Networks for Speaker Verification
    Zhang, Shanshan
    Zheng, Rong
    Xu, Bo
    2014 9TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2014: 73-77
  • [4] Text clustering based on pre-trained models and autoencoders
    Xu, Qiang
    Gu, Hao
    Ji, ShengWei
    FRONTIERS IN COMPUTATIONAL NEUROSCIENCE, 2024, 17
  • [5] Efficient Key-Based Adversarial Defense for ImageNet by Using Pre-Trained Models
    Maungmaung, Aprilpyone
    Echizen, Isao
    Kiya, Hitoshi
    IEEE OPEN JOURNAL OF SIGNAL PROCESSING, 2024, 5: 902-913
  • [6] Pre-Trained Language Models and Their Applications
    Wang, Haifeng
    Li, Jiwei
    Wu, Hua
    Hovy, Eduard
    Sun, Yu
    ENGINEERING, 2023, 25: 51-65
  • [7] Pre-trained models: Past, present and future
    Han, Xu
    Zhang, Zhengyan
    Ding, Ning
    Gu, Yuxian
    Liu, Xiao
    Huo, Yuqi
    Qiu, Jiezhong
    Yao, Yuan
    Zhang, Ao
    Zhang, Liang
    Han, Wentao
    Huang, Minlie
    Jin, Qin
    Lan, Yanyan
    Liu, Yang
    Liu, Zhiyuan
    Lu, Zhiwu
    Qiu, Xipeng
    Song, Ruihua
    Tang, Jie
    Wen, Ji-Rong
    Yuan, Jinhui
    Zhao, Wayne Xin
    Zhu, Jun
    AI OPEN, 2021, 2: 225-250
  • [8] Natural Attack for Pre-trained Models of Code
    Yang, Zhou
    Shi, Jieke
    He, Junda
    Lo, David
    2022 ACM/IEEE 44TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING (ICSE 2022), 2022: 1482-1493
  • [9] HinPLMs: Pre-trained Language Models for Hindi
    Huang, Xixuan
    Lin, Nankai
    Li, Kexin
    Wang, Lianxi
    Gan, Suifu
    2021 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2021: 241-246
  • [10] SASV Based on Pre-trained ASV System and Integrated Scoring Module
    Zhang, Yuxiang
    Li, Zhuo
    Wang, Wenchao
    Zhang, Pengyuan
    INTERSPEECH 2022, 2022: 4376-4380