CROSS-DOMAIN SEMI-SUPERVISED AUDIO EVENT CLASSIFICATION USING CONTRASTIVE REGULARIZATION

Cited by: 2
Authors
Lee, Donmoon [1, 2]
Lee, Kyogu [1, 3]
Affiliations
[1] Seoul Natl Univ, Dept Intelligence & Informat, Mus & Res Grp, Seoul, South Korea
[2] Cochlear Ai, Seoul, South Korea
[3] Seoul Natl Univ, Artificial Intelligence Inst, Seoul, South Korea
Source
2021 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA) | 2021
Keywords
Audio event classification; semi-supervised learning; contrastive learning
DOI
10.1109/WASPAA52581.2021.9632721
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
In this study, we propose a novel semi-supervised training method that uses unlabeled data whose class distribution is completely different from that of the target data, or data without target labels. To this end, we introduce a contrastive regularization that is designed to be target-task-oriented and is trained simultaneously with the target task. In addition, we propose a simple augmentation strategy based on audio mixing that is performed on the samples within a batch. Experimental results validate that the proposed method contributes to improved performance and, in particular, show that it offers advantages in training stability and generalization.
Pages: 196-200
Number of pages: 5