Weakly Supervised Training of Hierarchical Attention Networks for Speaker Identification

Cited by: 1
Authors
Shi, Yanpei [1]
Huang, Qiang [1]
Hain, Thomas [1]
Affiliations
[1] Univ Sheffield, Dept Comp Sci, Speech & Hearing Res Grp, Sheffield, S Yorkshire, England
Source
Funding
"Innovate UK" project;
Keywords
Weakly Supervised Learning; Speaker Identification; Hierarchical Attention; X-vectors; Attention Mechanism;
DOI
10.21437/Interspeech.2020-1774
Chinese Library Classification
R36 [Pathology]; R76 [Otorhinolaryngology];
Discipline classification code
100104; 100213;
Abstract
Identifying multiple speakers without knowing where a speaker's voice is in a recording is a challenging task. In this paper, a hierarchical attention network is proposed to solve a weakly labelled speaker identification problem. The hierarchical structure, consisting of a frame-level encoder and a segment-level encoder, aims to learn speaker-related information both locally and globally. Speech streams are segmented into fragments. The frame-level encoder with attention learns features, highlights the target-related frames locally, and outputs a fragment-based embedding. The segment-level encoder works with a second attention layer to emphasize the fragments likely to be related to target speakers. The global information is finally collected from the segment-level module to predict speakers via a classifier. To evaluate the effectiveness of the proposed approach, artificial datasets based on Switchboard Cellular Part 1 (SWBC) and VoxCeleb1 are constructed in two conditions, in which speakers' voices are either overlapped or not overlapped. Compared with two baselines, the results show that the proposed approach achieves better performance. Moreover, further experiments are conducted to evaluate the impact of utterance segmentation. The results show that a reasonable segmentation can slightly improve identification performance.
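The two-level pooling described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the single tanh layers standing in for the encoders, the dot-product attention scoring, the sigmoid multi-label output, the fixed fragment length, and all names (`init_params`, `attention_pool`, `hierarchical_attention`, `frag_len`) are assumptions made only for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()


def attention_pool(h, w):
    """Attention-weighted average of the rows of h (T, D), scored by h @ w."""
    alpha = softmax(h @ w)      # (T,) attention weights, sum to 1
    return alpha @ h, alpha     # pooled vector (D,), weights (T,)


def init_params(feat_dim, hid_dim, n_speakers, seed=0):
    """Random weights for the toy model (hypothetical parameter names)."""
    rng = np.random.default_rng(seed)
    s = 0.1
    return {
        "W_frame": s * rng.standard_normal((feat_dim, hid_dim)),
        "w_frame_att": s * rng.standard_normal(hid_dim),
        "W_seg": s * rng.standard_normal((hid_dim, hid_dim)),
        "w_seg_att": s * rng.standard_normal(hid_dim),
        "W_cls": s * rng.standard_normal((hid_dim, n_speakers)),
    }


def hierarchical_attention(frames, frag_len, params):
    """frames: (T, F) acoustic features. Returns speaker scores and fragment weights."""
    # Split the stream into fixed-length fragments, dropping the remainder.
    n_frag = frames.shape[0] // frag_len
    fragments = frames[: n_frag * frag_len].reshape(n_frag, frag_len, -1)

    # Frame level: encode frames, then attention-pool each fragment to one embedding.
    frag_embs = np.stack([
        attention_pool(np.tanh(frag @ params["W_frame"]), params["w_frame_att"])[0]
        for frag in fragments
    ])                                                   # (n_frag, D)

    # Segment level: encode fragment embeddings, then attention-pool over fragments.
    g = np.tanh(frag_embs @ params["W_seg"])
    utt_emb, seg_alpha = attention_pool(g, params["w_seg_att"])

    # Classifier: one sigmoid score per speaker (assumed multi-label setup).
    probs = 1.0 / (1.0 + np.exp(-(utt_emb @ params["W_cls"])))
    return probs, seg_alpha


params = init_params(feat_dim=20, hid_dim=8, n_speakers=4)
frames = np.random.default_rng(1).standard_normal((95, 20))
probs, seg_alpha = hierarchical_attention(frames, frag_len=10, params=params)
print(probs.shape, seg_alpha.shape)
```

In this sketch the segment-level weights `seg_alpha` indicate which fragments contribute most to the utterance-level decision, which is the role the abstract assigns to the second attention layer when only recording-level speaker labels are available.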
Pages: 2992-2996
Number of pages: 5
Related papers
50 in total
  • [1] Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction
    Zhao, Zifeng
    Gu, Rongzhi
    Yang, Dongchao
    Tian, Jinchuan
    Zou, Yuexian
    INTERSPEECH 2022, 2022, : 5318 - 5322
  • [2] Weakly Supervised Attention Networks for Entity Recognition
    Patra, Barun
    Moniz, Joel Ruben Antony
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 6268 - 6273
  • [3] Weakly-Supervised Part-Attention and Mentored Networks for Vehicle Re-Identification
    Tang, Lisha
    Wang, Yi
    Chau, Lap-Pui
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (12) : 8887 - 8898
  • [4] SUPERVISED ATTENTION FOR SPEAKER RECOGNITION
    Kye, Seong Min
    Chung, Joon Son
    Kim, Hoirin
    2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 286 - 293
  • [5] "Sheldon speaking, bonjour !" - Leveraging Multilingual Tracks for (Weakly) Supervised Speaker Identification
    Bredin, Herve
    Roy, Anindya
    Pecheux, Nicolas
    Allauzen, Alexandre
    PROCEEDINGS OF THE 2014 ACM CONFERENCE ON MULTIMEDIA (MM'14), 2014, : 137 - 146
  • [6] Weakly Supervised Attention Map Training for Histological Localization of Colonoscopy Images
    Kwon, Jangho
    Choi, Kihwan
    2021 43RD ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE & BIOLOGY SOCIETY (EMBC), 2021, : 3725 - 3728
  • [7] The Weakly Supervised Network of Hierarchical Attention Mechanism for Fine-Grained Classification
    Long, Qian
    Wang, Gaihua
    Qu, Hongwei
    Yao, Jingxuan
    Zhu, Bolun
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT VII, ICIC 2024, 2024, 14868 : 257 - 265
  • [8] Weakly Supervised Extractive Summarization with Attention
    Zhuang, Yingying
    Lu, Yichao
    Wang, Simi
    SIGDIAL 2021: 22ND ANNUAL MEETING OF THE SPECIAL INTEREST GROUP ON DISCOURSE AND DIALOGUE (SIGDIAL 2021), 2021, : 520 - 529
  • [9] Hierarchical speaker identification using speaker clustering
    Sun, B
    Liu, WJ
    Zhong, QH
    2003 INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND KNOWLEDGE ENGINEERING, PROCEEDINGS, 2003, : 299 - 304
  • [10] Hierarchical graph attention networks for semi-supervised node classification
    Kangjie Li
    Yixiong Feng
    Yicong Gao
    Jian Qiu
    Applied Intelligence, 2020, 50 : 3441 - 3451