Knowledge Distillation Approach for Efficient Internal Language Model Estimation

Cited: 0
Authors
Chen, Zhipeng [1 ]
Xu, Haihua [1 ]
Khassanov, Yerbolat [1 ]
He, Yi [1 ]
Lu, Lu [1 ]
Ma, Zejun [1 ]
Wu, Ji [2 ]
Affiliations
[1] ByteDance, Beijing, Peoples R China
[2] Tsinghua Univ, Dept Elect Engn, Beijing, Peoples R China
Source
INTERSPEECH 2023, 2023
Keywords
ASR; language model; ILME; density ratio; knowledge distillation; efficiency
DOI
10.21437/Interspeech.2023-2479
Chinese Library Classification (CLC)
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Internal language model estimation (ILME) has demonstrated its efficacy in domain adaptation for end-to-end (E2E) ASR. However, the performance improvement comes at additional computational cost compared with conventional shallow fusion: to estimate the internal language model prior, one must run an extra forward pass over either the ASR decoder or a separate density ratio (DR) language model (LM) for every decoded utterance. In this paper, we propose a knowledge distillation (KD) approach to realize efficient ILME for the Listen-Attend-Spell (LAS) E2E ASR model. First, we extensively explore diverse ILME and DR methods and find that the ILM can be approximated with a DR-LM much smaller than the original ASR decoder. Furthermore, to match the performance of ILME, we propose to employ the estimated ILM as a teacher to train a small DR-LM via KD. In this way, we achieve the best of both worlds: performance comparable to ILME and the high efficiency of DR with a small DR-LM.
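As a rough orientation for the decoding objectives mentioned in the abstract, the following is a minimal LaTeX sketch of standard shallow fusion, ILME, and density ratio scoring, together with a token-level KD objective in which the estimated ILM acts as the teacher for a small DR-LM. The symbols, the student parameters \theta, and the interpolation weights \lambda are generic notation assumed here for illustration, not quantities reported in the paper.

% shallow fusion: interpolate the ASR posterior with an external LM
\hat{y} = \arg\max_{y}\ \log P_{\mathrm{ASR}}(y \mid x) + \lambda_{E}\,\log P_{\mathrm{ELM}}(y)

% ILME: additionally subtract the internal LM prior estimated from the ASR decoder
\hat{y} = \arg\max_{y}\ \log P_{\mathrm{ASR}}(y \mid x) - \lambda_{I}\,\log P_{\mathrm{ILM}}(y) + \lambda_{E}\,\log P_{\mathrm{ELM}}(y)

% density ratio: subtract a separately trained source-domain LM instead of the ILM
\hat{y} = \arg\max_{y}\ \log P_{\mathrm{ASR}}(y \mid x) - \lambda_{S}\,\log P_{\mathrm{DR}}(y) + \lambda_{E}\,\log P_{\mathrm{ELM}}(y)

% KD: fit a small DR-LM P_\theta to the estimated ILM's token distributions
\mathcal{L}_{\mathrm{KD}} = \sum_{t} \mathrm{KL}\!\left( P_{\mathrm{ILM}}(\cdot \mid y_{<t}) \,\middle\|\, P_{\theta}(\cdot \mid y_{<t}) \right)

At decoding time the distilled P_\theta replaces P_{\mathrm{DR}}, so only the small student LM needs the extra forward pass.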
Pages: 1339 - 1343 (5 pages)