Knowledge Distillation-Based Representation Learning for Short-Utterance Spoken Language Identification

Cited by: 16
Authors
Shen, Peng [1 ]
Lu, Xugang [1 ]
Li, Sheng [1 ]
Kawai, Hisashi [1 ]
Affiliations
[1] Natl Inst Informat & Commun Technol, Koganei, Tokyo, Japan
Funding
Japan Society for the Promotion of Science (JSPS);
Keywords
Task analysis; Speech processing; Training; Speech recognition; Neural networks; Feature extraction; Robustness; Internal representation learning; knowledge distillation; short utterances; spoken language identification;
DOI
10.1109/TASLP.2020.3023627
Chinese Library Classification (CLC)
O42 [Acoustics];
Subject Classification Codes
070206; 082403;
Abstract
With the successful application of deep feature learning algorithms, spoken language identification (LID) on long utterances achieves satisfactory performance. However, performance on short utterances degrades drastically, even when the LID system is trained on short utterances. The main reason is the large variation of the representations of short utterances, which results in high model confusion. To narrow the performance gap between long and short utterances, we propose a teacher-student representation learning framework based on a knowledge distillation method to improve LID performance on short utterances. In the proposed framework, in addition to training the student model on short utterances with their true labels, the internal representation at the output of a hidden layer of the student model is supervised by the teacher's representation of the corresponding longer utterances. By reducing the distance between the internal representations of short and long utterances, the student model can learn robust, discriminative representations for short utterances, which is expected to reduce model confusion. We conducted experiments on our in-house LID dataset and the NIST LRE07 dataset, and showed the effectiveness of the proposed methods for short-utterance LID tasks.
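
The abstract describes a two-term training objective: standard cross-entropy on the student's short-utterance predictions, plus a distance penalty pulling the student's hidden-layer representation of a short utterance toward the teacher's representation of the corresponding long utterance. The following is a minimal PyTorch sketch of that objective under stated assumptions: the LidNet architecture, the choice of distilled layer, the MSE distance, the weight alpha, and all dimensions are illustrative, not the authors' actual configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LidNet(nn.Module):
    # Toy LID model (hypothetical, not the paper's architecture):
    # frame encoder -> temporal mean pooling -> embedding -> language classifier.
    def __init__(self, feat_dim=40, emb_dim=256, num_langs=14):
        super().__init__()
        self.frame_enc = nn.Sequential(
            nn.Linear(feat_dim, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
        )
        self.embed = nn.Linear(512, emb_dim)      # hidden layer whose output is distilled
        self.classifier = nn.Linear(emb_dim, num_langs)

    def forward(self, x):                         # x: (batch, frames, feat_dim)
        h = self.frame_enc(x).mean(dim=1)         # pool over frames -> (batch, 512)
        rep = self.embed(h)                       # internal representation
        return rep, self.classifier(rep)

teacher = LidNet()                                # assumed pretrained on long utterances
student = LidNet()
teacher.eval()                                    # teacher stays frozen during distillation

def distill_loss(short_feats, long_feats, labels, alpha=0.5):
    # alpha is a hypothetical trade-off weight between the two loss terms.
    with torch.no_grad():
        t_rep, _ = teacher(long_feats)            # teacher rep of the long utterance
    s_rep, s_logits = student(short_feats)        # student rep/logits of its short segment
    ce = F.cross_entropy(s_logits, labels)        # supervision with true language labels
    kd = F.mse_loss(s_rep, t_rep)                 # shrink short-vs-long representation distance
    return ce + alpha * kd

# Dummy batch: 8 short segments (100 frames) cut from 8 long utterances (1000 frames).
short = torch.randn(8, 100, 40)
long_ = torch.randn(8, 1000, 40)
labels = torch.randint(0, 14, (8,))
distill_loss(short, long_, labels).backward()

Wrapping the teacher in torch.no_grad() acts as a stop-gradient, so only the student is pulled toward the long-utterance representations, matching the one-way supervision described in the abstract.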
Pages: 2674 - 2683
Page count: 10
Related Papers
50 records in total
  • [31] KD-INR: Time-Varying Volumetric Data Compression via Knowledge Distillation-Based Implicit Neural Representation
    Han, Jun
    Zheng, Hao
    Bi, Chongke
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2024, 30 (10) : 6826 - 6838
  • [32] Knowledge Distillation For CTC-based Speech Recognition Via Consistent Acoustic Representation Learning
    Tian, Sanli
    Deng, Keqi
    Li, Zehan
    Ye, Lingxuan
    Cheng, Gaofeng
    Li, Ta
    Yan, Yonghong
    INTERSPEECH 2022, 2022, : 2633 - 2637
  • [33] Knowledge Distillation-Based Robust UAV Swarm Communication Under Malicious Attacks
    Wu, Qirui
    Zhang, Yirun
    Yang, Zhaohui
    Shikh-Bahaei, Mohammad
    2024 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS WORKSHOPS, ICC WORKSHOPS 2024, 2024, : 1023 - 1029
  • [34] Facial landmark points detection using knowledge distillation-based neural networks
    Fard, Ali Pourramezan
    Mahoor, Mohammad H.
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2022, 215
  • [35] Toward Extremely Lightweight Distracted Driver Recognition With Distillation-Based Neural Architecture Search and Knowledge Transfer
    Liu, Dichao
    Yamasaki, Toshihiko
    Wang, Yu
    Mase, Kenji
    Kato, Jien
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (01) : 764 - 777
  • [36] Applying feature normalization based on pole filtering to short-utterance speech recognition using deep neural network
    Han, Jaemin
    Kim, Min Sik
    Kim, Hyung Soon
JOURNAL OF THE ACOUSTICAL SOCIETY OF KOREA, 2020, 39 (01) : 64 - 68
  • [37] MUFTI: Multi-Domain Distillation-Based Heterogeneous Federated Continuous Learning
    Gai, Keke
    Wang, Zijun
    Yu, Jing
    Zhu, Liehuang
    IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2025, 20 : 2721 - 2733
  • [38] DURATION-NORMALIZED FEATURE SELECTION FOR INDIAN SPOKEN LANGUAGE IDENTIFICATION IN UTTERANCE LENGTH MISMATCH
    Bakshi, Aarti M.
    Kopparapu, Sunil K.
JOURNAL OF ENGINEERING SCIENCE AND TECHNOLOGY, 2022, 17 (03) : 2120 - 2134
  • [39] Effective Intrusion Detection in Heterogeneous Internet-of-Things Networks via Ensemble Knowledge Distillation-based Federated Learning
    Shen, Jiyuan
    Yang, Wenzhuo
    Chu, Zhaowei
    Fan, Jiani
    Niyato, Dusit
    Lam, Kwok-Yan
    ICC 2024 - IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2024, : 2034 - 2039
  • [40] Representation Learning and Knowledge Distillation for Lightweight Domain Adaptation
    Bin Shah, Sayed Rafay
    Putty, Shreyas Subhash
    Schwung, Andreas
    2024 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI 2024, 2024, : 1202 - 1207