A lightweight framework for unsupervised anomalous sound detection based on selective learning of time-frequency domain features

Cited: 1
Authors
Wang, Yawei [1 ]
Zhang, Qiaoling [1 ,2 ]
Zhang, Weiwei [3 ]
Zhang, Yi [4 ]
Affiliations
[1] Zhejiang Sci Tech Univ, Sch Comp Sci & Technol, Sch Artificial Intelligence, Hangzhou 310018, Peoples R China
[2] Zhejiang Sci Tech Univ, Key Lab Intelligent Text & Flexible Interconnect Z, Hangzhou 310018, Peoples R China
[3] Dalian Maritime Univ, Informat Sci & Technol Coll, Dalian 116026, Peoples R China
[4] Zhejiang Sci Tech Univ, Sch Informat Sci & Engn, Hangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Anomalous sound detection; Spectrogram frames selection; Frequency-feature selection; Unsupervised deep learning;
DOI
10.1016/j.apacoust.2024.110308
Chinese Library Classification
O42 [Acoustics];
Subject Classification Code
070206; 082403;
Abstract
For industrial anomalous sound detection (ASD), self-supervised methods have achieved strong detection performance in many cases. Nevertheless, these methods typically rely on the availability of external auxiliary information, and they may not work when such information is unavailable. Unsupervised methods do not leverage auxiliary information, but they usually achieve lower detection performance than self-supervised ones. Although some unsupervised methods have shown promising performance improvements, these come at the cost of complex implementations or large model sizes. To address these issues, this paper presents an unsupervised ASD method based on spectrogram frames selection (SFS) and an AutoEncoder for Frequency-feature Selection (AEFS), called SFS-AEFS. First, SFS is built upon the temporal characteristics of machine sounds; it aims to select the spectrogram frames (SFs) that contain the primary sound information while discarding portions that are corrupted by noise or interference or do not contain the target sound. Next, AEFS is developed by introducing a Scaling Gate (SG) after the AutoEncoder (AE). For the selected SF features, AEFS selectively strengthens the learning of some frequency dimensions while weakening the remaining ones. Comparative experiments against current ASD methods were conducted on the DCASE 2020 Challenge Task 2 dataset. The results demonstrate that our method achieves the best performance among the relevant unsupervised methods and is comparable to the current state-of-the-art (SOTA) self-supervised methods. Moreover, our method is lightweight, with a model size of only 0.08 MB.
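The abstract describes the two components only at a high level. The sketch below, written in PyTorch, illustrates one plausible reading of the pipeline: an energy-based frame-selection step standing in for SFS, and a plain AE whose reconstruction error is re-weighted per frequency dimension by a learnable Scaling Gate, standing in for AEFS. The selection criterion, the sigmoid gate, the budget penalty, the layer sizes, and all names (select_spectrogram_frames, AEFS) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the SFS + AEFS idea from the abstract (assumptions noted in comments).
import torch
import torch.nn as nn


def select_spectrogram_frames(log_mel, keep_ratio=0.6):
    """Spectrogram Frames Selection (SFS), sketched with a frame-energy criterion:
    keep the highest-energy frames, assumed to carry the primary machine sound,
    and drop low-energy (noise-dominated or silent) frames.
    log_mel: (time, freq) tensor of log-mel features."""
    frame_energy = log_mel.mean(dim=1)                 # per-frame energy proxy
    k = max(1, int(keep_ratio * log_mel.shape[0]))     # number of frames to keep
    idx = torch.topk(frame_energy, k).indices.sort().values
    return log_mel[idx]                                # (k, freq)


class AEFS(nn.Module):
    """AutoEncoder for Frequency-feature Selection: a plain AE plus a Scaling
    Gate (SG) that re-weights the per-frequency reconstruction error, so some
    frequency dimensions dominate learning and the rest are down-weighted."""

    def __init__(self, n_freq=128, bottleneck=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_freq, 64), nn.ReLU(),
            nn.Linear(64, bottleneck), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck, 64), nn.ReLU(),
            nn.Linear(64, n_freq),
        )
        # Learnable per-frequency gate in (0, 1), applied after the AE.
        self.gate_logits = nn.Parameter(torch.zeros(n_freq))

    def forward(self, x):                               # x: (batch, n_freq)
        recon = self.decoder(self.encoder(x))
        gate = torch.sigmoid(self.gate_logits)          # Scaling Gate weights
        return recon, gate

    def loss(self, x, budget=0.5, lam=1.0):
        recon, gate = self(x)
        # Gate-weighted reconstruction error: strongly gated frequency
        # dimensions drive learning, weakly gated ones contribute little.
        weighted = (gate * (x - recon) ** 2).mean()
        # Budget penalty (an assumption) that keeps the gate from collapsing
        # to all-zeros; the paper's actual constraint may differ.
        return weighted + lam * (gate.mean() - budget) ** 2


if __name__ == "__main__":
    model = AEFS(n_freq=128)
    log_mel = torch.randn(312, 128)                     # placeholder log-mel spectrogram
    frames = select_spectrogram_frames(log_mel)         # SFS step
    score = model.loss(frames)                          # training loss / anomaly-score proxy
    print(float(score))
```

In this reading, training uses normal machine sounds only, and at test time the same gate-weighted reconstruction error on frames from an unseen recording serves as the anomaly score, mirroring the standard unsupervised AE-based ASD pipeline.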
Pages: 12