1D CNN Architectures for Music Genre Classification

Cited by: 17
Authors
Allamy, Safaa [1]
Koerich, Alessandro Lameiras [1]
Affiliation
[1] Univ Quebec, Ecole Technol Super, Montreal, PQ, Canada
Source
2021 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2021) | 2021
Keywords
Convolutional neural networks; deep learning; audio processing;
DOI
10.1109/SSCI50451.2021.9659979
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper proposes a 1D residual convolutional neural network (CNN) architecture for music genre classification and compares it with other recent 1D CNN architectures. The 1D CNNs learn a representation and a discriminant directly from the raw audio signal. Several convolutional layers capture the time-frequency characteristics of the audio signal and learn various filters relevant to the music genre recognition task. The proposed approach splits the audio signal into overlapping segments using a sliding window to comply with the fixed-length input constraint of the 1D CNNs. As a result, music genre classification can be carried out on a single audio segment or by aggregating the predictions over several audio segments, which improves the final accuracy. The performance of the proposed 1D residual CNN is assessed on a public dataset of 1,000 audio clips. The experimental results show that it achieves a mean accuracy of 80.93% in classifying music genres and outperforms other 1D CNN architectures.
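The segmentation and aggregation steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the zero-padding of short clips, and the two aggregation rules (sum of class probabilities and majority vote) are assumptions chosen for illustration, since the abstract does not say which window length, hop size, or aggregation rule the paper uses.

```python
import numpy as np


def split_into_segments(signal, window, hop):
    """Split a 1D signal into fixed-length overlapping segments.

    `window` and `hop` are in samples; hop < window gives overlap.
    Both values are hypothetical parameters, not taken from the paper.
    """
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    signal = np.asarray(signal)
    if len(signal) < window:
        # Assumption: zero-pad clips shorter than one window.
        signal = np.pad(signal, (0, window - len(signal)))
    starts = range(0, len(signal) - window + 1, hop)
    return np.stack([signal[s:s + window] for s in starts])


def aggregate_predictions(segment_probs, rule="sum"):
    """Combine per-segment class probabilities into one clip-level label.

    segment_probs has shape (n_segments, n_classes).
    """
    segment_probs = np.asarray(segment_probs)
    if rule == "sum":
        # Add the class probabilities over segments, pick the largest.
        return int(np.argmax(segment_probs.sum(axis=0)))
    if rule == "majority":
        # Each segment votes for its most likely class.
        votes = np.argmax(segment_probs, axis=1)
        counts = np.bincount(votes, minlength=segment_probs.shape[1])
        return int(np.argmax(counts))
    raise ValueError(f"unknown rule: {rule}")
```

In use, each row returned by `split_into_segments` would be passed to a trained 1D CNN, and the resulting per-segment probability vectors would be stacked and handed to `aggregate_predictions`. The two rules can disagree when a few segments are predicted with high confidence and the rest are near ties.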
Pages: 7