Audio-Based Music Classification with DenseNet and Data Augmentation

Cited by: 14

Authors:

Bian, Wenhao [1,2]

Wang, Jie [2]

Zhuang, Bojin [2]

Yang, Jiankui [1]

Wang, Shaojun [2]

Xiao, Jing [2]

Affiliations:

[1] Beijing University of Posts and Telecommunications, Beijing, People's Republic of China

[2] Ping An Technology (Shenzhen) Co., Ltd., Shenzhen, People's Republic of China

Source:

PRICAI 2019: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III | 2019 / Vol. 11672

Keywords:

Music classification; Spectrogram; CNN; ResNet; DenseNet; Deep learning;

DOI:

10.1007/978-3-030-29894-4_5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In recent years, deep learning techniques have received intense attention owing to their great success in image recognition. A trend toward adopting deep learning has emerged in various information processing fields, including music information retrieval (MIR). In this paper, we conduct a comprehensive study on music audio classification with improved convolutional neural networks (CNNs). To the best of our knowledge, this is the first work to apply Densely Connected Convolutional Networks (DenseNet), which has been demonstrated to perform better than the Residual Neural Network (ResNet), to music audio tagging. Additionally, two data augmentation approaches, time overlapping and pitch shifting, are proposed to address the scarcity of labelled data in MIR. Moreover, stacking-based ensemble learning with an SVM is employed. We believe that the proposed combination of DenseNet's strong representation power and data augmentation can be adapted to other audio processing tasks.
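
The abstract names time overlapping as an augmentation approach but does not define it. One plausible reading is that each clip is cut into fixed-length windows that overlap, so one recording yields several training examples. The following is a minimal sketch under that assumption only; the function name `overlapping_windows` and the parameters `win` and `hop` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def overlapping_windows(samples: Sequence[float], win: int, hop: int) -> List[List[float]]:
    """Slice a 1-D signal into full-length windows that overlap by win - hop samples.

    Assumption: this is an illustrative reading of "time overlapping",
    not the procedure used by the paper's authors.
    """
    if win <= 0 or hop <= 0:
        raise ValueError("win and hop must be positive")
    if hop > win:
        raise ValueError("hop must not exceed win, otherwise windows do not overlap")
    windows = []
    # Stop at the last start index that still yields a full window.
    for start in range(0, len(samples) - win + 1, hop):
        windows.append(list(samples[start:start + win]))
    return windows


# Toy signal of 10 samples; a window of 4 with a hop of 2 gives 50% overlap.
signal = [float(i) for i in range(10)]
segments = overlapping_windows(signal, win=4, hop=2)
```

In practice the windows would be cut from audio samples or spectrogram frames, and a smaller hop would produce more, but more strongly correlated, training segments.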

Citation

Pages: 56-65

Number of pages: 10
