Audio-Based Music Classification with DenseNet and Data Augmentation

Cited by: 14
Authors
Bian, Wenhao [1, 2]
Wang, Jie [2]
Zhuang, Bojin [2]
Yang, Jiankui [1]
Wang, Shaojun [2]
Xiao, Jing [2]
Affiliations
[1] Beijing University of Posts and Telecommunications, Beijing, People's Republic of China
[2] Ping An Technology (Shenzhen) Co., Ltd., Shenzhen, People's Republic of China
Source
PRICAI 2019: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III | 2019 / Vol. 11672
Keywords
Music classification; Spectrogram; CNN; ResNet; DenseNet; Deep learning
DOI
10.1007/978-3-030-29894-4_5
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, deep learning has received intense attention owing to its great success in image recognition. This has driven its adoption across various information processing fields, including music information retrieval (MIR). In this paper, we conduct a comprehensive study on music audio classification with improved convolutional neural networks (CNNs). To the best of our knowledge, this is the first work to apply Densely Connected Convolutional Networks (DenseNet) to music audio tagging, and DenseNet is demonstrated to perform better than the Residual Neural Network (ResNet). Additionally, two data augmentation approaches, time overlapping and pitch shifting, are proposed to address the shortage of labelled data in MIR. Moreover, stacking-based ensemble learning with an SVM is employed. We believe that the proposed combination of DenseNet's strong representation and data augmentation can be adapted to other audio processing tasks.
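The abstract names time overlapping as an augmentation approach but does not give its details. One common reading, assumed here and not confirmed by the record, is that a track is cut into fixed-length clips whose start points advance by less than the clip length, so neighbouring clips share samples and one track yields more training examples. The function name `overlapping_segments` and the parameters `window` and `hop` are hypothetical, chosen only for this illustration.

```python
from typing import List, Sequence


def overlapping_segments(
    samples: Sequence[float], window: int, hop: int
) -> List[List[float]]:
    """Cut a signal into fixed-length windows; they overlap when hop < window.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    segments = []
    # Stop at the last start index that still yields a full window.
    for start in range(0, len(samples) - window + 1, hop):
        segments.append(list(samples[start:start + window]))
    return segments


# Toy signal of 10 samples standing in for an audio waveform.
signal = [float(i) for i in range(10)]

# A hop of half the window length gives 50% overlap between neighbours.
clips = overlapping_segments(signal, window=4, hop=2)

# For comparison: hop equal to the window gives no overlap and fewer clips.
plain = overlapping_segments(signal, window=4, hop=4)
```

With these toy values the overlapping split produces more clips than the non-overlapping one from the same signal, which is the sense in which overlap acts as augmentation. Each clip would then be converted to a spectrogram before being fed to the network.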
Pages: 56-65
Page count: 10
Related papers
50 in total
  • [41] Channel-Attention-Based DenseNet Network for Remote Sensing Image Scene Classification
    Tong, Wei
    Chen, Weitao
    Han, Wei
    Li, Xianju
    Wang, Lizhe
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2020, 13 : 4121 - 4132
  • [42] Data Augmentation for Blind Signal Classification
    Wang, Peng
    Vindiola, Manuel
    MILCOM 2019 - 2019 IEEE MILITARY COMMUNICATIONS CONFERENCE (MILCOM), 2019,
  • [43] Dermoscopy Image Classification Based on StyleGAN and DenseNet201
    Zhao, Chen
    Shuai, Renjun
    Ma, Li
    Liu, Wenjia
    Hu, Die
    Wu, Menglin
    IEEE ACCESS, 2021, 9 : 8659 - 8679
  • [44] Benchmarking Audio-based Deep Learning Models for Detection and Identification of Unmanned Aerial Vehicles
    Katta, Sai Srinadhu
    Nandyala, Sivaprasad
    Viegas, Eduardo Kugler
    AlMahmoud, Abdelrahman
    5TH WORKSHOP ON BENCHMARKING CYBER-PHYSICAL SYSTEMS AND INTERNET OF THINGS (CPS-IOTBENCH 2022), 2022, : 7 - 11
  • [45] Comparative study of data augmentation methods for fake audio detection
    Park, KwanYeol
    Kwak, Il-Youp
    KOREAN JOURNAL OF APPLIED STATISTICS, 2023, 36 (02) : 101 - 114
  • [46] Parallel classification model of arrhythmia based on DenseNet-BiLSTM
    Gan, Yi
    Shi, Jun-cheng
    He, Wei-ming
    Sun, Fu-jia
    BIOCYBERNETICS AND BIOMEDICAL ENGINEERING, 2021, 41 (04) : 1548 - 1560
  • [47] DESPECKLING BASED DATA AUGMENTATION APPROACH IN DEEP LEARNING BASED RADAR TARGET CLASSIFICATION
    Ceylan, S. H. Mert
    Erer, Isin
    2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022, : 2706 - 2709
  • [48] Building a DenseNet-Based Neural Network with Transformer and MBConv Blocks for Penile Cancer Classification
    Lauande, Marcos Gabriel Mendes
    Braz Junior, Geraldo
    de Almeida, Joao Dallyson Sousa
    Silva, Aristofanes Correa
    da Costa, Rui Miguel Gil
    Teles, Amanda Mara
    da Silva, Leandro Lima
    Brito, Haissa Oliveira
    Vidal, Flavia Castello Branco
    do Vale, Joao Guilherme Araujo
    Rodrigues Junior, Jose Ribamar Durand
    Cunha, Antonio
    APPLIED SCIENCES-BASEL, 2024, 14 (22):
  • [49] A new approach for classification of generic audio data
    Lin, RS
    Chen, LH
    INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2005, 19 (01) : 63 - 78
  • [50] REVISITING THE PROBLEM OF AUDIO-BASED HIT SONG PREDICTION USING CONVOLUTIONAL NEURAL NETWORKS
    Yang, Li-Chia
    Chou, Szu-Yu
    Liu, Jen-Yu
    Yang, Yi-Hsuan
    Chen, Yi-An
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 621 - 625