Re-weighted Interval Loss for Handling Data Imbalance Problem of End-to-End Keyword Spotting

Cited by: 6
Authors
Zhang, Kun [1]
Wu, Zhiyong [1,2,3]
Yuan, Daode [4]
Luan, Jian [4]
Jia, Jia [1,2]
Meng, Helen [1,3]
Song, Binheng [1]
Affiliations
[1] Tsinghua Univ, Grad Sch Shenzhen, Tsinghua CUHK Joint Res Ctr Media Sci Technol & S, Shenzhen, Peoples R China
[2] Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol BNRis, Dept Comp Sci & Technol, Beijing, Peoples R China
[3] Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Shatin, Hong Kong, Peoples R China
[4] Microsoft, Xiaoice, Beijing, Peoples R China
Source
INTERSPEECH 2020 | 2020
Funding
National Natural Science Foundation of China
Keywords
keyword spotting; end-to-end; data imbalance; re-weighting; speech recognition; focal loss
DOI
10.21437/Interspeech.2020-1644
Chinese Library Classification number
R36 [Pathology]; R76 [Otorhinolaryngology]
Discipline classification code
100104; 100213
Abstract
The training process of end-to-end keyword spotting (KWS) suffers from a critical data imbalance problem: positive samples are far fewer than negative samples, and the negative samples are not equally important. During decoding, false alarms are mainly caused by a small number of important negative samples whose pronunciation is similar to the keyword; however, the training loss is dominated by the majority of negative samples whose pronunciation is unrelated to the keyword, called unimportant negative samples. This inconsistency greatly degrades KWS performance, and existing methods such as focal loss do not discriminate between the two kinds of negative samples. To address the problem, we propose a novel re-weighted interval loss that re-weights each sample's loss according to the classifier's performance over a local interval of the negative utterance. It automatically down-weights the losses of unimportant negative samples and focuses training on important negative samples that are prone to produce false alarms during decoding. Evaluations on the Hey Snips dataset show that our approach outperforms the focal loss baseline, with a 34% relative reduction in false reject rate at 0.5 false alarms per hour.
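The contrast drawn in the abstract can be illustrated with a minimal Python sketch. The `focal_loss` function follows the standard binary focal loss, -(1 - p_t)^gamma * log(p_t). The functions `interval_weights` and `reweighted_negative_loss`, the window parameter `half_width`, and the choice of "maximum keyword posterior in the window" as the weight are hypothetical stand-ins chosen for illustration; the record does not give the paper's actual formula, so this should not be read as the authors' loss.

```python
import math


def focal_loss(p_keyword, is_keyword, gamma=2.0):
    """Standard binary focal loss for a single frame.

    p_keyword is the predicted keyword probability for that frame.
    """
    p_t = p_keyword if is_keyword else 1.0 - p_keyword
    p_t = min(max(p_t, 1e-7), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def interval_weights(neg_probs, half_width=2, gamma=2.0):
    """ASSUMPTION (not the paper's formula): weight each negative frame by
    the highest keyword probability found in a local window around it,
    raised to gamma. Frames near a confusable region get large weights."""
    n = len(neg_probs)
    weights = []
    for i in range(n):
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        weights.append(max(neg_probs[lo:hi]) ** gamma)
    return weights


def reweighted_negative_loss(neg_probs, half_width=2, gamma=2.0):
    """Sum of per-frame cross-entropy losses on a negative utterance,
    scaled by the hypothetical interval weights above."""
    weights = interval_weights(neg_probs, half_width, gamma)
    losses = [-math.log(max(1.0 - p, 1e-7)) for p in neg_probs]
    return sum(w * l for w, l in zip(weights, losses))


if __name__ == "__main__":
    # Keyword posteriors on a negative utterance; frames 3 and 4 sound
    # similar to the keyword, the rest do not.
    probs = [0.01, 0.02, 0.01, 0.6, 0.7, 0.02, 0.01]
    print(interval_weights(probs, half_width=1))
    print(reweighted_negative_loss(probs, half_width=1))
```

In this sketch, focal loss scales each frame only by its own confidence, whereas the interval weighting lets a well-classified frame keep a large weight when its neighbours are being confused with the keyword, which is the kind of local-interval behaviour the abstract describes.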
Pages: 2567-2571
Number of pages: 5
Related papers
13 in total
  • [1] A Novel Re-weighted CTC Loss for Data Imbalance in Speech Keyword Spotting
    Lan Xiaotian
    He Qianhua
    Yan Haikang
    Li Yanxiong
    CHINESE JOURNAL OF ELECTRONICS, 2023, 32 (03) : 465 - 473
  • [2] END-TO-END STREAMING KEYWORD SPOTTING
    Alvarez, Raziel
    Park, Hyun-Jin
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 6336 - 6340
  • [3] Metadata-Aware End-to-End Keyword Spotting
    Liu, Hongyi
    Abhyankar, Apurva
    Mishchenko, Yuriy
    Senechal, Thibaud
    Fu, Gengshen
    Kulis, Brian
    Stein, Noah
    Shah, Anish
    Vitaladevuni, Shiv Naga Prasad
    INTERSPEECH 2020, 2020, : 2282 - 2286
  • [4] End-to-End Multi-Look Keyword Spotting
    Yu, Meng
    Ji, Xuan
    Wu, Bo
    Su, Dan
    Yu, Dong
    INTERSPEECH 2020, 2020, : 66 - 70
  • [5] AN END-TO-END FAR-FIELD KEYWORD SPOTTING SYSTEM WITH NEURAL BEAMFORMING
    Ji, Xuan
    Lu, Lu
    Fang, Fuming
    Ma, Jianbo
    Zhu, Lei
    Li, Jinke
    Zhao, Dongdi
    Liu, Ming
    Jiang, Feijun
    2021 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU), 2021, : 892 - 899
  • [6] An End-to-End Model Based on TDNN-BiGRU for Keyword Spotting
    Chai, Shuzhou
    Yang, Zhenye
    Lv, Changsheng
    Zhang, Wei-Qiang
    PROCEEDINGS OF THE 2019 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2019, : 402 - 406
  • [7] END-TO-END KEYWORD SPOTTING USING NEURAL ARCHITECTURE SEARCH AND QUANTIZATION
    Peter, David
    Roth, Wolfgang
    Pernkopf, Franz
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 3423 - 3427
  • [8] END-TO-END LOW RESOURCE KEYWORD SPOTTING THROUGH CHARACTER RECOGNITION AND BEAM-SEARCH RE-SCORING
    Mekonnen, Ephrem Tibebe
    Brutti, Alessio
    Falavigna, Daniele
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8182 - 8186
  • [9] End-to-End Speech Keyword Spotting Training Method Based on Sample's Class Uncertainty
    He, Qian-Hua
    Chen, Yong-Qiang
    Zheng, Ruo-Wei
    Huang, Jin-Xin
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2024, 52 (10) : 3482 - 3492
  • [10] End-to-End Transformer-Based Open-Vocabulary Keyword Spotting with Location-Guided Local Attention
    Wei, Bo
    Yang, Meirong
    Zhang, Tao
    Tang, Xiao
    Huang, Xing
    Kim, Kyuhong
    Lee, Jaeyun
    Cho, Kiho
    Park, Sung-Un
    INTERSPEECH 2021, 2021, : 361 - 365