Improving Audio Classification Method by Combining Self-Supervision with Knowledge Distillation

Cited by: 0
Authors
Gong, Xuchao [1]
Duan, Hongjie [1]
Yang, Yaozhong [1]
Tan, Lizhuang [2, 3]
Wang, Jian [4]
Vasilakos, Athanasios V. [5]
Affiliations
[1] Shengli Petr Management Bur, Artificial Intelligence Res Inst, Dongying 257000, Peoples R China
[2] Qilu Univ Technol, Shandong Acad Sci, Shandong Comp Sci Ctr, Natl Supercomp Ctr Jinan,Key Lab Comp Power Networ, Jinan 250013, Peoples R China
[3] Shandong Fundamental Res Ctr Comp Sci, Shandong Prov Key Lab Comp Networks, Jinan 250013, Peoples R China
[4] China Univ Petr East China, Coll Sci, Qingdao 266580, Peoples R China
[5] Univ Agder UiA, Ctr AI Res CAIR, Dept ICT, N-4879 Grimstad, Norway
Funding
National Natural Science Foundation of China
Keywords
audio classification; comparative learning; knowledge distillation; masked auto-encoder; self-supervision; transformer; REPRESENTATION;
DOI
10.3390/electronics13010052
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Current single-modality self-supervised audio classification mainly relies on audio spectrum reconstruction. This self-supervised approach is relatively uniform and cannot fully mine key semantic information in the time and frequency domains. To address this, the article proposes a self-supervised method combined with knowledge distillation to further improve performance on audio classification tasks. First, given the two-dimensional nature of the audio spectrum, self-supervised strategies are constructed both along a single dimension (time or frequency) and in the joint time-frequency dimension. Audio spectrum details and key discriminative information are learned through information reconstruction, comparative learning, and other methods. Second, for feature-level self-supervision, two teacher-student learning strategies are constructed: one internal to the model and one based on knowledge distillation. Fitting the feature expression ability of the teacher model further enhances the generalization of audio classification. Comparative experiments were conducted on the AudioSet, ESC50, and VGGSound datasets. The results show that the proposed algorithm improves recognition accuracy by 0.5% to 1.3% over the best existing method based on a single audio modality.
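The single-dimension and joint time-frequency masking mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, mask ratios, patch size, and spectrogram shape are all assumptions chosen for illustration only.

```python
import numpy as np

def mask_axis(spec, ratio, axis, rng):
    """Zero out a random fraction of whole rows (axis=0, frequency)
    or whole columns (axis=1, time) of a 2-D spectrogram."""
    n = spec.shape[axis]
    k = int(round(ratio * n))
    idx = rng.choice(n, size=k, replace=False)
    mask = np.ones(spec.shape, dtype=bool)  # True = kept, False = masked
    if axis == 0:
        mask[idx, :] = False
    else:
        mask[:, idx] = False
    return np.where(mask, spec, 0.0), mask

def mask_patches(spec, ratio, patch, rng):
    """Zero out a random fraction of non-overlapping patch x patch
    blocks in the joint time-frequency plane."""
    n_freq, n_time = spec.shape
    grid_f, grid_t = n_freq // patch, n_time // patch
    k = int(round(ratio * grid_f * grid_t))
    chosen = rng.choice(grid_f * grid_t, size=k, replace=False)
    mask = np.ones(spec.shape, dtype=bool)
    for c in chosen:
        r, col = divmod(int(c), grid_t)
        mask[r * patch:(r + 1) * patch, col * patch:(col + 1) * patch] = False
    return np.where(mask, spec, 0.0), mask

rng = np.random.default_rng(0)
# Hypothetical spectrogram: 128 frequency bins x 256 time frames (assumed shape).
spec = rng.random((128, 256)) + 1.0

time_masked, time_mask = mask_axis(spec, 0.5, axis=1, rng=rng)   # time-only masking
freq_masked, freq_mask = mask_axis(spec, 0.5, axis=0, rng=rng)   # frequency-only masking
joint_masked, joint_mask = mask_patches(spec, 0.75, patch=16, rng=rng)  # joint masking
```

In a masked auto-encoder style setup, a model would then be trained to reconstruct the zeroed regions from the visible ones; the reconstruction objective and the distillation stage are not shown here.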
Pages: 17