Few-Shot Speaker Identification Using Lightweight Prototypical Network With Feature Grouping and Interaction

Cited by: 4
Authors
Li, Yanxiong [1]
Chen, Hao [1]
Cao, Wenchang [1]
Huang, Qisheng [1]
He, Qianhua [1]
Affiliations
[1] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510640, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature grouping; feature interaction; few-shot learning; prototypical network; speaker identification; RECOGNITION; VERIFICATION; ATTENTION;
DOI
10.1109/TMM.2023.3253301
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Existing methods for few-shot speaker identification (FSSI) achieve high accuracy, but their computational complexity and model size must be reduced for lightweight applications. In this work, we propose an FSSI method using a lightweight prototypical network, with the ultimate goal of implementing FSSI on intelligent terminals with limited resources, such as smart watches and smart speakers. In the proposed prototypical network, an embedding module is designed to perform feature grouping, which reduces the memory requirement and computational complexity, and feature interaction, which enhances the representational ability of the learned speaker embedding. In the proposed embedding module, the audio feature of each speech sample is split into several low-dimensional feature subsets that are transformed in parallel by a recurrent convolutional block. Then, the operations of averaging, addition, concatenation, element-wise summation, and statistics pooling are executed sequentially to learn a speaker embedding for each speech sample. The recurrent convolutional block consists of a bidirectional long short-term memory block and a de-redundancy convolution block, in which feature grouping and interaction are also conducted. Our method is compared with baseline methods on three datasets selected from three public speech corpora (VoxCeleb1, VoxCeleb2, and LibriSpeech). The results show that our method obtains higher accuracy under several conditions and has advantages over all baseline methods in computational complexity and model size.
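The abstract names three generic building blocks: splitting a feature vector into low-dimensional subsets, statistics pooling, and prototype-based identification. The sketch below illustrates only these generic ideas in plain Python and is not the authors' implementation. The recurrent convolutional block (bidirectional LSTM and de-redundancy convolution) is omitted. All function names, the equal-size grouping, the mean-plus-standard-deviation pooling, and the squared Euclidean distance are assumptions made for illustration.

```python
from statistics import fmean, pstdev


def split_into_groups(frame, num_groups):
    """Split one feature vector into equal-size low-dimensional subsets (assumed scheme)."""
    if len(frame) % num_groups != 0:
        raise ValueError("feature dimension must be divisible by num_groups")
    size = len(frame) // num_groups
    return [frame[i * size:(i + 1) * size] for i in range(num_groups)]


def statistics_pooling(frames):
    """Collapse a (time x dim) sequence into per-dimension means followed by std devs."""
    columns = list(zip(*frames))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return means + stds


def prototypes(support):
    """Map each speaker to the mean of that speaker's support embeddings."""
    return {
        speaker: [fmean(col) for col in zip(*embeddings)]
        for speaker, embeddings in support.items()
    }


def identify(query, protos):
    """Return the speaker whose prototype is nearest to the query embedding."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(protos, key=lambda speaker: squared_distance(query, protos[speaker]))


# Toy 2-way, 2-shot episode with hand-made 2-dimensional embeddings.
support = {
    "speaker_a": [[0.0, 0.1], [0.2, -0.1]],
    "speaker_b": [[1.0, 0.9], [0.8, 1.1]],
}
protos = prototypes(support)
print(identify([0.9, 1.0], protos))  # prints "speaker_b"
```

In a real system the embeddings would come from a trained embedding module rather than hand-written lists; the toy episode only shows how a query is assigned to the nearest class prototype.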
Pages: 9241-9253
Page count: 13