MULTI-TASK LEARNING WITH CROSS ATTENTION FOR KEYWORD SPOTTING

Cited by: 3
Authors
Higuchi, Takuya [1]
Gupta, Anmol [2]
Dhir, Chandra [1]
Affiliations
[1] Apple, Cupertino, CA USA
[2] Univ Hong Kong, Dept Comp Sci, Hong Kong, Peoples R China
Keywords
keyword spotting; Transformer; multi-task learning;
DOI
10.1109/ASRU51503.2021.9687967
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Keyword spotting (KWS) is an important technique for speech applications, which enables users to activate devices by speaking a keyword phrase. Although a phoneme classifier can be used for KWS, exploiting a large amount of transcribed data for automatic speech recognition (ASR), there is a mismatch between the training criterion (phoneme recognition) and the target task (KWS). Recently, multi-task learning has been applied to KWS to exploit both ASR and KWS training data. In this approach, the output of an acoustic model is split into two branches for the two tasks, one for phoneme transcription trained with the ASR data and one for keyword classification trained with the KWS data. In this paper, we introduce a cross attention decoder in the multi-task learning framework. Unlike the conventional multi-task learning approach with a simple split of the output layer, the cross attention decoder summarizes information from a phonetic encoder by performing cross attention between the encoder outputs and a trainable query sequence to predict a confidence score for the KWS task. Experimental results on KWS tasks show that the proposed approach achieves a 12% relative reduction in the false reject ratios compared to the conventional multi-task learning with split branches and a bi-directional long short-term memory decoder.
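As a rough illustration of the mechanism described in the abstract, the sketch below shows how a trainable query sequence can cross-attend to phonetic encoder outputs to produce a keyword confidence score alongside a per-frame phoneme branch. This is a minimal PyTorch sketch under assumed settings: the layer sizes, number of queries, phoneme inventory size, and the simple linear heads are illustrative choices, not the configuration reported in the paper.

    # Minimal sketch of a multi-task KWS model with a cross attention decoder.
    # All hyperparameters below are illustrative assumptions.
    import torch
    import torch.nn as nn

    class CrossAttentionKWS(nn.Module):
        def __init__(self, feat_dim=80, d_model=256, n_heads=4,
                     n_queries=4, n_phonemes=42):
            super().__init__()
            # Shared phonetic encoder (a Transformer encoder; depth and
            # width here are placeholders).
            self.frontend = nn.Linear(feat_dim, d_model)
            enc_layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                                   batch_first=True)
            self.encoder = nn.TransformerEncoder(enc_layer, num_layers=4)

            # ASR branch: per-frame phoneme posteriors, trained on
            # transcribed ASR data.
            self.phoneme_head = nn.Linear(d_model, n_phonemes)

            # KWS branch: a trainable query sequence cross-attends to the
            # encoder outputs; the summarized result is mapped to a single
            # keyword confidence score.
            self.queries = nn.Parameter(torch.randn(n_queries, d_model))
            self.cross_attn = nn.MultiheadAttention(d_model, n_heads,
                                                    batch_first=True)
            self.kws_head = nn.Linear(n_queries * d_model, 1)

        def forward(self, feats):
            # feats: (batch, frames, feat_dim) acoustic features
            enc = self.encoder(self.frontend(feats))       # (B, T, d_model)

            phoneme_logits = self.phoneme_head(enc)        # (B, T, n_phonemes)

            q = self.queries.unsqueeze(0).expand(enc.size(0), -1, -1)
            summary, _ = self.cross_attn(q, enc, enc)      # (B, n_queries, d_model)
            kws_logit = self.kws_head(summary.flatten(1))  # (B, 1) confidence

            return phoneme_logits, kws_logit

In joint training, the phoneme logits would be optimized with a phoneme-transcription loss on ASR data and the keyword logit with a binary keyword-detection loss on KWS data, with the two losses combined by a weighting factor; the specific losses and weights are likewise assumptions here, not details taken from the paper.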
Pages: 571 - 578
Number of pages: 8
Related Papers
50 records in total
  • [21] Multi-task Supervised Learning via Cross-learning
    Cervino, Juan
    Andres Bazerque, Juan
    Calvo-Fullana, Miguel
    Ribeiro, Alejandro
    29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 1381 - 1385
  • [22] ORTHOGONALITY CONSTRAINED MULTI-HEAD ATTENTION FOR KEYWORD SPOTTING
    Lee, Mingu
    Lee, Jinkyu
    Jang, Hye Jin
    Kim, Byeonggeun
    Chang, Wonil
    Hwang, Kyuwoong
    2019 IEEE AUTOMATIC SPEECH RECOGNITION AND UNDERSTANDING WORKSHOP (ASRU 2019), 2019, : 86 - 92
  • [23] Multi-task Learning for Paraphrase Generation With Keyword and Part-of-Speech Reconstruction
    Xie, Xuhang
    Lu, Xuesong
    Chen, Bei
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 1234 - 1243
  • [24] Facial Expression Recognition by Regional Attention and Multi-task Learning
    Cui, Longlei
    Tian, Ying
    ENGINEERING LETTERS, 2021, 29 (03) : 919 - 925
  • [25] An efficient multi-task learning CNN for driver attention monitoring
    Yang, Dawei
    Wang, Yan
    Wei, Ran
    Guan, Jiapeng
    Huang, Xiaohua
    Cai, Wei
    Jiang, Zhe
    JOURNAL OF SYSTEMS ARCHITECTURE, 2024, 148
  • [26] Attention-Oriented Deep Multi-Task Hash Learning
    Wang, Letian
    Meng, Ziyu
    Dong, Fei
    Yang, Xiao
    Xi, Xiaoming
    Nie, Xiushan
    ELECTRONICS, 2023, 12 (05)
  • [27] Adversarial Learning for Multi-Task Sequence Labeling With Attention Mechanism
    Wang, Yu
    Li, Yun
    Zhu, Ziye
    Tong, Hanghang
    Huang, Yue
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 2476 - 2488
  • [28] Cross-stitch Networks for Multi-task Learning
    Misra, Ishan
    Shrivastava, Abhinav
    Gupta, Abhinav
    Hebert, Martial
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 3994 - 4003
  • [29] Dispatched attention with multi-task learning for nested mention recognition
    Fei, Hao
    Ren, Yafeng
    Ji, Donghong
INFORMATION SCIENCES, 2020, 513 : 241 - 251
  • [30] Contrastive Modules with Temporal Attention for Multi-Task Reinforcement Learning
    Lan, Siming
    Zhang, Rui
    Yi, Qi
    Guo, Jiaming
    Peng, Shaohui
    Gao, Yunkai
    Wu, Fan
    Chen, Ruizhi
    Du, Zidong
    Hu, Xing
    Zhang, Xishan
    Li, Ling
    Chen, Yunji
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,