Audio-Based Music Classification with DenseNet and Data Augmentation

Cited by: 14
Authors
Bian, Wenhao [1, 2]
Wang, Jie [2 ]
Zhuang, Bojin [2 ]
Yang, Jiankui [1 ]
Wang, Shaojun [2 ]
Xiao, Jing [2 ]
Affiliations
[1] Beijing University of Posts and Telecommunications, Beijing, People's Republic of China
[2] Ping An Technology (Shenzhen) Co., Ltd., Shenzhen, People's Republic of China
Source
PRICAI 2019: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III | 2019 / Vol. 11672
Keywords
Music classification; Spectrogram; CNN; ResNet; DenseNet; Deep learning
DOI
10.1007/978-3-030-29894-4_5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, deep learning has received intense attention owing to its great success in image recognition. Its adoption has spread to various information processing fields, including music information retrieval (MIR). In this paper, we conduct a comprehensive study of music audio classification with improved convolutional neural networks (CNNs). To the best of our knowledge, this is the first work to apply Densely Connected Convolutional Networks (DenseNet) to music audio tagging, and DenseNet is demonstrated to perform better than the residual neural network (ResNet). Additionally, two data augmentation approaches, time overlapping and pitch shifting, are proposed to address the shortage of labelled data in MIR. Moreover, a stacking ensemble based on an SVM is employed. We believe that the proposed combination of DenseNet's strong representation and data augmentation can be adapted to other audio processing tasks.
Pages: 56-65
Page count: 10
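Illustrative sketch
The abstract names time overlapping as one of the two augmentation approaches but does not describe how it is done. The sketch below shows one common reading, offered only as an assumption: a clip is cut into fixed-length windows whose start points are closer together than the window length, so neighbouring windows share samples and one clip yields several training examples. The function name `overlapping_segments` and the parameters `window` and `hop` are hypothetical and do not come from the paper. Pitch shifting is not covered here.

```python
import numpy as np


def overlapping_segments(signal, window, hop):
    """Cut a 1-D signal into fixed-length windows (hypothetical helper).

    Windows overlap whenever hop < window. A trailing remainder that is
    shorter than one full window is dropped.
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    if len(signal) < window:
        # Not enough samples for a single window: return an empty batch.
        return np.empty((0, window), dtype=signal.dtype)
    starts = range(0, len(signal) - window + 1, hop)
    return np.stack([signal[s:s + window] for s in starts])


# Toy "audio clip" of 10 samples, cut with 50% overlap (window 4, hop 2).
clip = np.arange(10, dtype=np.float32)
segments = overlapping_segments(clip, window=4, hop=2)
print(segments.shape)
```

With real audio, `window` and `hop` would be given in samples (for example, a few seconds of audio at the clip's sample rate), and each segment would inherit the label of the clip it was cut from.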