VIC-KD: VARIANCE-INVARIANCE-COVARIANCE KNOWLEDGE DISTILLATION TO MAKE KEYWORD SPOTTING MORE ROBUST AGAINST ADVERSARIAL ATTACKS

Cited by: 0
Authors
Guimaraes, Heitor R. [1]
Pimentel, Arthur
Avila, Anderson
Falk, Tiago H.
Affiliations
[1] Univ Quebec, Inst Natl Rech Sci INRS EMT, Montreal, PQ, Canada
Source
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2024) | 2024
Keywords
Keyword Spotting; Adversarial Robustness; Knowledge Distillation; Robust Distillation; VICReg
DOI
10.1109/ICASSP48485.2024.10447992
Chinese Library Classification number
O42 [Acoustics]
Subject classification codes
070206; 082403
Abstract
Keyword spotting (KWS) refers to the task of identifying a set of predefined words in audio streams. With recent advances in deep neural networks, it has become a popular technology to activate and control small devices, such as voice assistants. Relying on such models for edge devices, however, can be challenging due to hardware constraints. Moreover, as adversarial attacks against voice-based technologies have increased, developing solutions robust to such attacks has become crucial. In this work, we propose VIC-KD, a robust distillation recipe for model compression and adversarial robustness. Using self-supervised speech representations, we show that imposing geometric priors on the latent representations of both Teacher and Student models leads to more robust target models. Experiments on the Google Speech Commands datasets show that the proposed methodology improves upon current state-of-the-art robust distillation methods, such as ARD and RSLAD, by 12% and 8% in robust accuracy, respectively.
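The title and the VICReg keyword suggest that the "geometric priors" are variance, invariance, and covariance terms computed on latent representations. The abstract does not say how these terms are weighted or combined with the distillation objective, so the following is only a minimal sketch of the three terms in the general VICReg style, not the authors' implementation. The function name `vic_terms`, the helper `_covariance_matrix`, and the values of `gamma` and `eps` are assumptions made for illustration.

```python
import math


def _covariance_matrix(z):
    """Sample covariance matrix (d x d) of a batch z given as n rows of d floats."""
    n, d = len(z), len(z[0])
    means = [sum(row[j] for row in z) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in z]
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def vic_terms(z_a, z_b, gamma=1.0, eps=1e-4):
    """Return (invariance, variance, covariance) for two embedding batches.

    z_a, z_b: lists of n rows with d floats each (hypothetical latent
    representations, e.g. from two branches or two views of the same input).
    gamma and eps are illustrative values, not taken from the paper.
    """
    n, d = len(z_a), len(z_a[0])

    # Invariance: mean squared error between paired embeddings.
    invariance = sum(
        (a - b) ** 2 for ra, rb in zip(z_a, z_b) for a, b in zip(ra, rb)
    ) / (n * d)

    variance = 0.0
    covariance = 0.0
    for z in (z_a, z_b):
        cov = _covariance_matrix(z)
        # Variance: hinge that penalizes dimensions whose std falls below gamma.
        variance += sum(
            max(0.0, gamma - math.sqrt(cov[j][j] + eps)) for j in range(d)
        ) / d
        # Covariance: squared off-diagonal entries, pushing dimensions to decorrelate.
        covariance += sum(
            cov[i][j] ** 2 for i in range(d) for j in range(d) if i != j
        ) / d
    return invariance, variance, covariance
```

In this sketch, a collapsed batch (all rows identical) is penalized only by the variance term, while a batch whose dimensions are perfectly correlated is penalized only by the covariance term.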
Pages: 12196-12200
Page count: 5