MULTI-TASK LEARNING WITH CROSS ATTENTION FOR KEYWORD SPOTTING

Citations: 3
Authors:
Higuchi, Takuya [1]
Gupta, Anmol [2]
Dhir, Chandra [1]
Affiliations:
[1] Apple, Cupertino, CA, USA
[2] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
Keywords:
keyword spotting; Transformer; multi-task learning
DOI:
10.1109/ASRU51503.2021.9687967
CLC Number (Chinese Library Classification):
TP18 [Artificial Intelligence Theory]
Discipline Codes:
081104; 0812; 0835; 1405
Abstract:
Keyword spotting (KWS) is an important technique for speech applications, enabling users to activate devices by speaking a keyword phrase. Although a phoneme classifier trained on large amounts of transcribed automatic speech recognition (ASR) data can be used for KWS, there is a mismatch between the training criterion (phoneme recognition) and the target task (KWS). Recently, multi-task learning has been applied to KWS to exploit both ASR and KWS training data. In this approach, the output of an acoustic model is split into two branches, one for phoneme transcription trained with the ASR data and one for keyword classification trained with the KWS data. In this paper, we introduce a cross attention decoder in the multi-task learning framework. Unlike the conventional multi-task learning approach with a simple split of the output layer, the cross attention decoder summarizes information from a phonetic encoder by performing cross attention between the encoder outputs and a trainable query sequence to predict a confidence score for the KWS task. Experimental results on KWS tasks show that the proposed approach achieves a 12% relative reduction in the false reject ratio compared to conventional multi-task learning with split branches and a bi-directional long short-term memory decoder.
Pages: 571-578
Page count: 8
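
The abstract describes the cross attention decoder only at a high level. The following is a minimal PyTorch sketch of that mechanism, a trainable query sequence attending over phonetic encoder outputs to produce a keyword confidence score. All layer sizes, the number of queries and heads, the sigmoid output, and every name here are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class CrossAttentionDecoder(nn.Module):
        # Summarizes phonetic encoder outputs into a keyword confidence
        # score by attending over them with a trainable query sequence.
        # Dimensions are assumptions for illustration.
        def __init__(self, enc_dim=256, num_queries=4, num_heads=4):
            super().__init__()
            # Trainable query sequence, shared across all utterances.
            self.queries = nn.Parameter(torch.randn(num_queries, enc_dim))
            self.cross_attn = nn.MultiheadAttention(
                enc_dim, num_heads, batch_first=True)
            # Maps the attended summary to a single keyword-confidence logit.
            self.classifier = nn.Linear(num_queries * enc_dim, 1)

        def forward(self, enc_out):
            # enc_out: (batch, time, enc_dim) outputs of the phonetic encoder.
            batch = enc_out.size(0)
            queries = self.queries.unsqueeze(0).expand(batch, -1, -1)
            # Cross attention: trainable queries attend over encoder frames.
            summary, _ = self.cross_attn(queries, enc_out, enc_out)
            logit = self.classifier(summary.flatten(start_dim=1))
            return torch.sigmoid(logit).squeeze(-1)  # confidence in [0, 1]

    # Usage: one confidence score per utterance.
    decoder = CrossAttentionDecoder()
    enc_out = torch.randn(2, 100, 256)  # two utterances, 100 frames each
    scores = decoder(enc_out)           # shape (2,)

In the multi-task setup the abstract describes, a decoder like this would sit on top of the shared phonetic encoder alongside a phoneme-classification branch trained on the ASR data.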