Speaker recognition using isomorphic graph attention network based pooling on self-supervised representation

Cited by: 1
Authors
Ge, Zirui [1]
Xu, Xinzhou [2]
Guo, Haiyan [1]
Wang, Tingting [1]
Yang, Zhen [1]
Affiliations
[1] Nanjing Univ Posts & Telecommun, Sch Commun & Informat Engn, Nanjing 2100023, Jiangsu, Peoples R China
[2] Nanjing Univ Posts & Telecommun, Sch Internet Things, Nanjing 2100023, Jiangsu, Peoples R China
Funding
National Natural Science Foundation of China; China Postdoctoral Science Foundation
Keywords
Speaker recognition; Self-supervised representation; Isomorphic graph attention network; Pooling; Angular margin loss
DOI
10.1016/j.apacoust.2024.109929
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The emergence of self-supervised representations (i.e., wav2vec 2.0) allows speaker-recognition approaches to process spoken signals through foundation models built on speech data. Nevertheless, effective fusion of these representations requires further investigation, because existing approaches rely on fixed or sub-optimal temporal pooling strategies. Despite improved strategies that consider graph learning and graph attention, non-injective aggregation still exists in these approaches, which may affect speaker-recognition performance. In this regard, we propose a speaker recognition approach using an Isomorphic Graph ATtention network (IsoGAT) on self-supervised representations. The proposed approach contains three modules, namely representation learning, graph attention, and aggregation, jointly considering learning on the self-supervised representation and the IsoGAT. We then perform speaker recognition experiments on the VoxCeleb1&2 datasets; the results demonstrate the recognition performance of the proposed approach compared with existing pooling approaches on the self-supervised representation.
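The term "non-injective aggregation" in the abstract can be illustrated with a toy example. The sketch below is not the IsoGAT model and is not taken from the paper; it only shows, under the assumption of two hypothetical multisets of 2-D frame-level features (`frames_a`, `frames_b`), that mean and max pooling map different inputs to the same output, while sum pooling keeps them apart in this case. All names are made up for illustration.

```python
# Illustrative only: a toy demonstration of non-injective pooling,
# not the method proposed in the paper.

# Two different hypothetical multisets of 2-D frame-level features.
frames_a = [(1.0, 0.0), (0.0, 1.0)]
frames_b = [(1.0, 0.0), (0.0, 1.0), (1.0, 0.0), (0.0, 1.0)]


def sum_pool(frames):
    """Sum each feature dimension over all frames."""
    return tuple(sum(dim) for dim in zip(*frames))


def mean_pool(frames):
    """Average each feature dimension over all frames."""
    n = len(frames)
    return tuple(sum(dim) / n for dim in zip(*frames))


def max_pool(frames):
    """Take the maximum of each feature dimension over all frames."""
    return tuple(max(dim) for dim in zip(*frames))


# Mean and max cannot tell the two inputs apart (a collision),
# whereas the sum differs because it retains multiplicity.
mean_collides = mean_pool(frames_a) == mean_pool(frames_b)
max_collides = max_pool(frames_a) == max_pool(frames_b)
sum_collides = sum_pool(frames_a) == sum_pool(frames_b)

print("mean collides:", mean_collides)
print("max collides:", max_collides)
print("sum collides:", sum_collides)
```

A pooling function that collapses distinct inputs in this way discards information that could separate speakers, which is the general concern the abstract raises about existing pooling strategies.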
Pages: 8