Audio-Based Music Classification with DenseNet and Data Augmentation

Cited by: 14

Authors:

Bian, Wenhao [1,2]

Wang, Jie [2]

Zhuang, Bojin [2]

Yang, Jiankui [1]

Wang, Shaojun [2]

Xiao, Jing [2]

Affiliations:

[1] Beijing University of Posts and Telecommunications, Beijing, People's Republic of China

[2] Ping An Technology (Shenzhen) Co., Ltd., Shenzhen, People's Republic of China

Source:

PRICAI 2019: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III | 2019 / Vol. 11672

Keywords:

Music classification; Spectrogram; CNN; ResNet; DenseNet; Deep learning;

DOI:

10.1007/978-3-030-29894-4_5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In recent years, deep learning techniques have received intense attention owing to their great success in image recognition. Deep learning is now being adopted across many information processing fields, including music information retrieval (MIR). In this paper, we conduct a comprehensive study of music audio classification with improved convolutional neural networks (CNNs). To the best of our knowledge, this is the first work to apply Densely Connected Convolutional Networks (DenseNet) to music audio tagging, where DenseNet is demonstrated to perform better than the Residual Neural Network (ResNet). Additionally, two data augmentation approaches, time overlapping and pitch shifting, are proposed to address the shortage of labelled data in MIR. Moreover, stacking-based ensemble learning with an SVM is employed. We believe that the proposed combination of DenseNet's strong representation and data augmentation can be adapted to other audio processing tasks.
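
The abstract names time overlapping as an augmentation approach but gives no parameters. As a rough illustration only, the sketch below slices a clip into fixed-length windows whose starts are spaced by a hop smaller than the window, so neighbouring training segments share samples. The function name `overlapping_segments`, the window length, and the hop are all assumptions made for this example, not values taken from the paper. Pitch shifting is not shown here; a library routine such as `librosa.effects.pitch_shift` is one common way to do it.

```python
import numpy as np

def overlapping_segments(signal, window, hop):
    """Slice a 1-D signal into fixed-length windows.

    Consecutive windows overlap whenever hop < window, which yields
    more training segments from the same clip (hypothetical helper).
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    if len(signal) < window:
        # Clip is too short to yield even one full window.
        return np.empty((0, window), dtype=signal.dtype)
    starts = range(0, len(signal) - window + 1, hop)
    return np.stack([signal[s:s + window] for s in starts])

# Toy 10-sample "clip"; window of 4 with hop of 2 gives 50% overlap.
clip = np.arange(10)
segments = overlapping_segments(clip, window=4, hop=2)
```

With these toy values the clip yields four overlapping segments, against two when the hop equals the window length, which is the sense in which overlapping enlarges a small labelled set.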

Pages: 56-65

Page count: 10

50 items in total

[31] Wildfire-Detection Method Using DenseNet and CycleGAN Data Augmentation-Based Remote Camera Imagery
Park, Minsoo
Dai Quoc Tran
Jung, Daekyo
Park, Seunghee
REMOTE SENSING, 2020, 12 (22) : 1 - 16
[32] Conditional Generative Data Augmentation for Clinical Audio Datasets
Seibold, Matthias
Hoch, Armando
Farshad, Mazda
Navab, Nassir
Fuernstahl, Philipp
MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2022, PT VII, 2022, 13437 : 345 - 354
[33] Surveillance audio-based rainfall observation: An enhanced strategy for extreme rainfall observation
Wang, Xing
Glade, Thomas
Schmaltz, Elmar
Liu, Xuejun
APPLIED ACOUSTICS, 2023, 211
[34] Multi-Task Learning for Audio-Based Infant Cry Detection and Reasoning
Xia, Ming
Huang, Dongmin
Wang, Wenjin
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2024, 28 (12) : 7434 - 7446
[35] Enhancing Intake Monitoring: Transfer Learning for Audio-Based Detection of Swallowing Events
Chen, Xin
Kamavuako, Ernest
2024 IEEE 22ND MEDITERRANEAN ELECTROTECHNICAL CONFERENCE, MELECON 2024, 2024, : 479 - 484
[36] A novel scene classification model combining ResNet based transfer learning and data augmentation with a filter
Liu, Shaopeng
Tian, Guohui
Xu, Yuan
NEUROCOMPUTING, 2019, 338 : 191 - 206
[37] DATA AUGMENTATION FOR CHEST PATHOLOGIES CLASSIFICATION
Sirazitdinov, Ilyas
Kholiavchenko, Maksym
Kuleev, Ramil
Ibragimov, Bulat
2019 IEEE 16TH INTERNATIONAL SYMPOSIUM ON BIOMEDICAL IMAGING (ISBI 2019), 2019, : 1216 - 1219
[38] Data Augmentation-Based Enhancement for Efficient Network Traffic Classification
Shin, Chang-Yui
Choi, Yang-Seo
Kim, Myung-Sup
IEEE ACCESS, 2025, 13 : 6006 - 6028
[39] Data Augmentation for Deep Learning-Based Radio Modulation Classification
Huang, Liang
Pan, Weijian
Zhang, You
Qian, Liping
Gao, Nan
Wu, Yuan
IEEE ACCESS, 2020, 8 : 1498 - 1506
[40] DualDiscWaveGAN-Based Data Augmentation Scheme for Animal Sound Classification
Kim, Eunbeen
Moon, Jaeuk
Shim, Jonghwa
Hwang, Eenjun
SENSORS, 2023, 23 (04)
