Efficient Integrated Features Based on Pre-trained Models for Speaker Verification

Cited by: 0
Authors
Li, Yishuang [1 ,2 ]
Guan, Wenhao [3 ]
Huang, Hukai [3 ]
Miao, Shiyu [2 ]
Su, Qi [2 ]
Li, Lin [1 ,2 ]
Hong, Qingyang [3 ]
Affiliations
[1] Xiamen Univ, Inst Artificial Intelligence, Xiamen, Peoples R China
[2] Xiamen Univ, Sch Elect Sci & Engn, Xiamen, Peoples R China
[3] Xiamen Univ, Sch Informat, Xiamen, Peoples R China
Source
INTERSPEECH 2024 | 2024
Funding
National Natural Science Foundation of China;
Keywords
speaker verification; pre-trained models; feature integration; t-SNE; SPEECH;
DOI
10.21437/Interspeech.2024-1889
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104; 0812; 0835; 1405;
Abstract
Previous work has explored the application of pre-trained models (PTMs) to speaker verification (SV). Most researchers directly replaced handcrafted features with the universal representations of PTMs and jointly fine-tuned the PTMs with the downstream SV networks, which discarded important spectral information contained in handcrafted features and also increased the training cost. In this paper, we proposed an efficient feature integration method that utilized a Fine-grained Fusion Module to adaptively fuse the multi-layer representations of PTMs. We then integrated the fused representations with handcrafted features to obtain the integrated features, which were subsequently fed into the SV network. The experimental results demonstrated that the integrated features effectively enhanced the performance of the SV systems and yielded decent results without fine-tuning the PTMs. Moreover, full-parameter fine-tuning produced the best results overall.
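The abstract does not spell out the internals of the Fine-grained Fusion Module, but a common way to fuse multi-layer PTM representations adaptively is a learnable softmax-weighted sum over layers, after which the fused representation can be concatenated with handcrafted features (e.g. filterbanks) along the feature axis. The sketch below illustrates that general pattern in NumPy; the function names, shapes, and the uniform initial weights are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_layers(layer_reps, layer_scores):
    """Adaptive layer fusion: softmax-weighted sum over PTM layers.
    layer_reps: (num_layers, T, D) hidden states from the PTM;
    layer_scores: (num_layers,) raw (learnable) scores."""
    w = softmax(layer_scores)
    return np.tensordot(w, layer_reps, axes=1)  # -> (T, D)

def integrate(fused, handcrafted):
    """Concatenate the fused PTM representation with frame-aligned
    handcrafted features (e.g. FBanks) along the feature axis."""
    return np.concatenate([fused, handcrafted], axis=-1)

# Toy shapes: 4 layers, 10 frames, 8-dim PTM states, 6-dim FBanks.
L, T, D, F = 4, 10, 8, 6
rng = np.random.default_rng(0)
reps = rng.standard_normal((L, T, D))
fbank = rng.standard_normal((T, F))

fused = fuse_layers(reps, np.zeros(L))  # zero scores = uniform weights
feats = integrate(fused, fbank)
print(feats.shape)  # (10, 14)
```

In a trainable system the layer scores would be parameters updated with the SV network, so the model learns which PTM layers carry speaker-discriminative information; here they are fixed for illustration.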
Pages: 2140 - 2144
Number of pages: 5