Domestic Activities Classification from Audio Recordings Using Multi-scale Dilated Depthwise Separable Convolutional Network

Cited by: 1
Authors
Zeng, Yufei [1]
Li, Yanxiong [1]
Zhou, Zhenfeng [1]
Wang, Ruiqi [1]
Lu, Difeng [1]
Affiliations
[1] South China University of Technology, School of Electronic and Information Engineering, Guangzhou, People's Republic of China
Source
IEEE MMSP 2021: 2021 IEEE 23rd International Workshop on Multimedia Signal Processing (MMSP) | 2021
Funding
National Natural Science Foundation of China
Keywords
Domestic activities classification; multi-scale embedding; dilated convolution; depthwise separable convolution; NEURAL-NETWORK; SCENE;
DOI
10.1109/MMSP53017.2021.9733646
Chinese Library Classification number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Domestic activities classification (DAC) from audio recordings assigns each recording to one of a set of pre-defined categories of domestic activities, and is an effective way to estimate the daily activities performed in a home environment. In this paper, we propose a method for DAC from audio recordings using a multi-scale dilated depthwise separable convolutional network (DSCN). The DSCN is a lightweight neural network with a small number of parameters, and is thus suitable for deployment on portable terminals with limited computing resources. To expand the receptive field without increasing the number of parameters, the DSCN uses dilated convolution instead of standard convolution, which further improves its performance. In addition, the embeddings of various scales learned by the dilated DSCN are concatenated into a multi-scale embedding that represents the property differences among the classes of domestic activities. Experiments on the public dataset of Task 5 of the 2018 challenge on Detection and Classification of Acoustic Scenes and Events (DCASE-2018) show that both dilated convolution and multi-scale embedding contribute to the performance of the proposed method, and that the proposed method outperforms methods based on state-of-the-art lightweight networks in terms of classification accuracy.
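The two parameter-related claims in the abstract can be checked with simple arithmetic. The sketch below is illustrative only and is not the authors' implementation: the channel counts, the 3x3 kernel, the dilation rates, and all function names are assumptions chosen for the example. It compares the weight count of a standard convolution with that of a depthwise separable one, shows that dilation widens the receptive field of a stride-1 stack without adding weights, and concatenates per-scale embeddings into one vector.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """One k x k depthwise filter per input channel, plus a 1 x 1
    pointwise convolution that mixes channels (bias omitted)."""
    return c_in * k * k + c_in * c_out


def effective_kernel_size(k, dilation):
    """Span covered by a k-tap kernel whose taps are `dilation` apart."""
    return dilation * (k - 1) + 1


def receptive_field(k, dilations):
    """Receptive field along one axis of stacked stride-1 layers."""
    rf = 1
    for d in dilations:
        rf += effective_kernel_size(k, d) - 1
    return rf


def multi_scale_embedding(embeddings):
    """Concatenate per-scale embedding vectors into a single vector."""
    return [value for embedding in embeddings for value in embedding]


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
print(standard_conv_params(64, 128, 3))        # 73728
print(depthwise_separable_params(64, 128, 3))  # 8768

# Three stacked 3-tap layers: the weight count is identical in both cases,
# but the assumed dilation rates (1, 2, 4) widen the receptive field.
print(receptive_field(3, [1, 1, 1]))  # 7
print(receptive_field(3, [1, 2, 4]))  # 15

# Toy embeddings from three scales, joined into one multi-scale vector.
print(multi_scale_embedding([[0.1, 0.2], [0.3], [0.4, 0.5]]))
```

With these assumed sizes the depthwise separable layer needs roughly one eighth of the weights of the standard layer, and the dilated stack covers about twice the span of the undilated one at the same weight count.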
Pages: 5