Audio-Based Music Classification with DenseNet and Data Augmentation

Cited by: 14

Authors:

Bian, Wenhao [1,2]

Wang, Jie [2]

Zhuang, Bojin [2]

Yang, Jiankui [1]

Wang, Shaojun [2]

Xiao, Jing [2]

Affiliations:

[1] Beijing University of Posts and Telecommunications, Beijing, People's Republic of China

[2] Ping An Technology (Shenzhen) Co., Ltd., Shenzhen, People's Republic of China

Source:

PRICAI 2019: TRENDS IN ARTIFICIAL INTELLIGENCE, PT III | 2019 / Vol. 11672

Keywords:

Music classification; Spectrogram; CNN; ResNet; DenseNet; Deep learning;

DOI:

10.1007/978-3-030-29894-4_5

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In recent years, deep learning techniques have received intense attention owing to their great success in image recognition. A trend toward adopting deep learning has formed in various information processing fields, including music information retrieval (MIR). In this paper, we conduct a comprehensive study on music audio classification with improved convolutional neural networks (CNNs). To the best of our knowledge, this is the first work to apply Densely Connected Convolutional Networks (DenseNet) to music audio tagging, where DenseNet has been demonstrated to perform better than the Residual Neural Network (ResNet). Additionally, two specific data augmentation approaches, time overlapping and pitch shifting, are proposed to address the shortage of labelled data in MIR. Moreover, stacking-based ensemble learning built on an SVM is employed. We believe that the proposed combination of the strong representation of DenseNet and data augmentation can be adapted to other audio processing tasks.
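The abstract names time overlapping as an augmentation but does not spell out the procedure. One common reading, offered here only as an assumption and not as the authors' method, is to cut each clip into fixed-length windows whose start positions advance by less than the window length, so that neighbouring segments share samples and a single clip yields more training examples. The function name `overlapping_segments` and the parameters `window` and `hop` are hypothetical and chosen for illustration.

```python
def overlapping_segments(samples, window, hop):
    """Split a sequence into fixed-length windows that start every `hop` items.

    Assumption (not taken from the paper): with hop < window, consecutive
    segments share window - hop items, which multiplies the number of
    training examples obtained from one clip.
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    last_start = len(samples) - window
    # A trailing remainder too short to fill a whole window is dropped.
    return [samples[s:s + window] for s in range(0, last_start + 1, hop)]


# Toy "signal" of 10 samples, window of 4, 50% overlap (hop of 2).
clip = list(range(10))
segments = overlapping_segments(clip, window=4, hop=2)
baseline = overlapping_segments(clip, window=4, hop=4)  # no overlap
```

In this toy case the overlapping split produces four segments where the non-overlapping split produces two. Pitch shifting, the other augmentation named in the abstract, needs a signal-processing library and is not sketched here.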

Citation

Pages: 56-65

Page count: 10

