Knowledge Distillation-Based Representation Learning for Short-Utterance Spoken Language Identification

Cited by: 16
|
Authors
Shen, Peng [1]
Lu, Xugang [1]
Li, Sheng [1]
Kawai, Hisashi [1]
Affiliations
[1] Natl Inst Informat & Commun Technol, Koganei, Tokyo, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
Task analysis; Speech processing; Training; Speech recognition; Neural networks; Feature extraction; Robustness; Internal representation learning; knowledge distillation; short utterances; spoken language identification;
DOI
10.1109/TASLP.2020.3023627
CLC Classification Number
O42 [Acoustics];
Subject Classification Codes
070206; 082403;
Abstract
With the successful application of deep feature learning algorithms, spoken language identification (LID) on long utterances achieves satisfactory performance. However, performance on short utterances degrades drastically even when the LID system is trained on short utterances. The main reason is the large variation of the representations of short utterances, which results in high model confusion. To narrow the performance gap between long and short utterances, we propose a teacher-student representation learning framework based on a knowledge distillation method to improve LID performance on short utterances. In the proposed framework, in addition to training the student model on short utterances with their true labels, the internal representation from the output of a hidden layer of the student model is supervised with the representation corresponding to the longer utterances. By reducing the distance between the internal representations of short and long utterances, the student model can learn robust, discriminative representations for short utterances, which is expected to reduce model confusion. We conducted experiments on our in-house LID dataset and the NIST LRE07 dataset, and the results show the effectiveness of the proposed methods for short-utterance LID tasks.
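The following is a minimal, illustrative sketch (not the authors' code) of the training objective the abstract describes: a student model is trained on short utterances with cross-entropy against the true language labels, while a hidden-layer representation of the student is pulled toward the teacher's representation of the corresponding long utterance. The model structure, the MSE distance, and names such as StudentLID, hidden_dim, and alpha are assumptions made for the example; the paper may use a different architecture and distance.

```python
# Sketch of teacher-student internal-representation distillation for short-utterance LID.
# Assumptions: toy encoder, MSE as the representation distance, weight `alpha`.
import torch
import torch.nn as nn
import torch.nn.functional as F

class StudentLID(nn.Module):
    """Toy LID model: frame-level encoder -> mean pooling -> language classifier."""
    def __init__(self, feat_dim=40, hidden_dim=256, num_langs=14):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.classifier = nn.Linear(hidden_dim, num_langs)

    def forward(self, x):
        # x: (batch, frames, feat_dim)
        h = self.encoder(x).mean(dim=1)      # utterance-level internal representation
        return self.classifier(h), h

def distillation_loss(student_logits, student_repr, teacher_repr, labels, alpha=0.5):
    """Cross-entropy on true labels plus a distance between the student's
    representation (short utterance) and the teacher's representation of
    the corresponding long utterance."""
    ce = F.cross_entropy(student_logits, labels)
    repr_dist = F.mse_loss(student_repr, teacher_repr)  # one possible choice of distance
    return ce + alpha * repr_dist

# Usage sketch: teacher_repr would come from a frozen teacher model fed the
# full-length utterance; the student only sees a short segment of it.
student = StudentLID()
short_feats = torch.randn(8, 100, 40)        # 8 short utterances, 100 frames each
teacher_repr = torch.randn(8, 256)           # placeholder for precomputed teacher representations
labels = torch.randint(0, 14, (8,))
logits, repr_ = student(short_feats)
loss = distillation_loss(logits, repr_, teacher_repr, labels)
loss.backward()
```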
Pages: 2674-2683
Number of pages: 10