1D CNN Architectures for Music Genre Classification

Cited by: 16
Authors
Allamy, Safaa [1]
Koerich, Alessandro Lameiras [1]
Affiliations
[1] Univ Quebec, Ecole Technol Super, Montreal, PQ, Canada
Source
2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021) | 2021
Keywords
Convolutional neural networks; deep learning; audio processing
DOI
10.1109/SSCI50451.2021.9659979
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper proposes a 1D residual convolutional neural network (CNN) architecture for music genre classification and compares it with other recent 1D CNN architectures. The 1D CNNs learn a representation and a discriminant directly from the raw audio signal. Several convolutional layers capture the time-frequency characteristics of the audio signal and learn various filters relevant to the music genre recognition task. The proposed approach splits the audio signal into overlapping segments using a sliding window to comply with the fixed-length input constraint of the 1D CNNs. As a result, music genre classification can be carried out on a single audio segment or by aggregating the predictions over several audio segments, which improves the final accuracy. The performance of the proposed 1D residual CNN is assessed on a public dataset of 1,000 audio clips. The experimental results show that it achieves a mean accuracy of 80.93% in classifying music genres and outperforms other 1D CNN architectures.
Pages: 7
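The abstract describes two steps that a short sketch may make more concrete: cutting the raw signal into fixed-length overlapping segments with a sliding window, and combining per-segment predictions into one clip-level decision. The following is a minimal illustration in plain Python, not the authors' implementation. The function names, the `window` and `hop` parameters, and both aggregation rules (mean of class probabilities and majority vote) are assumptions made for illustration; the abstract does not state which aggregation rule or window settings the paper uses.

```python
from collections import Counter
from typing import List, Sequence


def split_into_segments(signal: Sequence[float], window: int, hop: int) -> List[List[float]]:
    """Cut a 1D signal into fixed-length segments that overlap when hop < window.

    Trailing samples that do not fill a whole window are dropped (an assumption;
    zero-padding the last segment would be another option).
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    last_start = len(signal) - window
    return [list(signal[start:start + window]) for start in range(0, last_start + 1, hop)]


def aggregate_by_mean(segment_probs: Sequence[Sequence[float]]) -> int:
    """Average the per-segment class probabilities and return the best class index."""
    if not segment_probs:
        raise ValueError("need at least one segment prediction")
    n_classes = len(segment_probs[0])
    mean = [sum(p[c] for p in segment_probs) / len(segment_probs) for c in range(n_classes)]
    return max(range(n_classes), key=mean.__getitem__)


def aggregate_by_vote(segment_probs: Sequence[Sequence[float]]) -> int:
    """Let each segment vote for its most probable class and return the winner."""
    if not segment_probs:
        raise ValueError("need at least one segment prediction")
    votes = Counter(max(range(len(p)), key=p.__getitem__) for p in segment_probs)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy signal of 10 samples, window of 4 samples, 50% overlap (hop of 2).
    segments = split_into_segments(list(range(10)), window=4, hop=2)
    print(len(segments), segments[0], segments[-1])

    # Hypothetical per-segment class probabilities from some classifier.
    probs = [[0.6, 0.4], [0.1, 0.9], [0.55, 0.45]]
    print(aggregate_by_mean(probs), aggregate_by_vote(probs))
```

In the toy probabilities above, the two rules disagree: two of three segments narrowly favour class 0, while the averaged probabilities favour class 1. This is why the choice of aggregation rule can matter when segment-level predictions are uncertain.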