Domestic Activities Classification from Audio Recordings Using Multi-scale Dilated Depthwise Separable Convolutional Network

Cited by: 1
Authors
Zeng, Yufei [1]
Li, Yanxiong [1]
Zhou, Zhenfeng [1]
Wang, Ruiqi [1]
Lu, Difeng [1]
Affiliation
[1] South China University of Technology, School of Electronic and Information Engineering, Guangzhou, People's Republic of China
Source
IEEE MMSP 2021: 2021 IEEE 23RD INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP) | 2021
Funding
National Natural Science Foundation of China
Keywords
Domestic activities classification; multi-scale embedding; dilated convolution; depthwise separable convolution; NEURAL-NETWORK; SCENE
DOI
10.1109/MMSP53017.2021.9733646
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Domestic activities classification (DAC) from audio recordings aims to classify audio recordings into pre-defined categories of domestic activities, which is an effective way to estimate the daily activities performed in a home environment. In this paper, we propose a method for DAC from audio recordings using a multi-scale dilated depthwise separable convolutional network (DSCN). The DSCN is a lightweight neural network with a small number of parameters and is thus suitable for deployment on portable terminals with limited computing resources. To expand the receptive field without increasing the number of parameters, dilated convolution is used in the DSCN instead of normal convolution, further improving its performance. In addition, the embeddings of various scales learned by the dilated DSCN are concatenated into a multi-scale embedding that represents the property differences among the classes of domestic activities. Evaluated on the public dataset of Task 5 of the 2018 challenge on Detection and Classification of Acoustic Scenes and Events (DCASE-2018), the results show that both dilated convolution and multi-scale embedding contribute to the performance improvement of the proposed method, and that the proposed method outperforms methods based on state-of-the-art lightweight networks in terms of classification accuracy.
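The three ideas named in the abstract (depthwise separable convolution, dilation, and concatenation of embeddings from several scales) can be illustrated with a minimal pure-Python sketch. It is not the authors' network: the function names, the 1-D toy signal, the reuse of the same kernels at every dilation rate, and the use of time-averaging to form each embedding are all assumptions made here for illustration only.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard 2-D convolution with a k x k kernel (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise (k x k per input channel) plus pointwise (1 x 1) pair."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise


def effective_kernel_size(k, dilation):
    """Span covered by a k-tap kernel whose taps are `dilation` samples apart."""
    return k + (k - 1) * (dilation - 1)


def dilated_depthwise_conv1d(x, kernels, dilation):
    """Depthwise 1-D convolution (valid padding): each channel uses its own kernel.

    x: list of channels, each a list of floats.
    kernels: one list of k weights per channel.
    """
    out = []
    for channel, kernel in zip(x, kernels):
        span = effective_kernel_size(len(kernel), dilation)
        n_out = len(channel) - span + 1
        out.append([
            sum(w * channel[t + i * dilation] for i, w in enumerate(kernel))
            for t in range(n_out)
        ])
    return out


def pointwise_conv1d(x, weights):
    """1 x 1 convolution: mix channels at every time step.

    weights: one row per output channel, one column per input channel.
    """
    n_steps = len(x[0])
    return [
        [sum(w * x[c][t] for c, w in enumerate(row)) for t in range(n_steps)]
        for row in weights
    ]


def global_average(x):
    """Average each channel over time to get a fixed-length embedding."""
    return [sum(channel) / len(channel) for channel in x]


def multi_scale_embedding(x, kernels, weights, dilations):
    """Concatenate the embeddings obtained at several dilation rates.

    Assumption: the same kernels are reused at every scale to keep the
    sketch short; a real network would typically learn separate weights.
    """
    embedding = []
    for d in dilations:
        features = pointwise_conv1d(dilated_depthwise_conv1d(x, kernels, d), weights)
        embedding.extend(global_average(features))
    return embedding
```

For a 3 x 3 layer with 64 input and 128 output channels, `standard_conv_params(64, 128, 3)` gives 73728 weights while `depthwise_separable_params(64, 128, 3)` gives 8768, and `effective_kernel_size(3, 2)` gives 5: the dilated kernel spans a wider context with the same three weights per axis, which is the trade-off the abstract describes.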
Pages: 5