Multi-Level and Multi-Scale Feature Aggregation Using Pretrained Convolutional Neural Networks for Music Auto-Tagging

Cited by: 81
Authors
Lee, Jongpil [1]
Nam, Juhan [1]
Affiliations
[1] Korea Adv Inst Sci & Technol, Grad Sch Culture Technol, Daejeon 34141, South Korea
Funding
National Research Foundation, Singapore
Keywords
Convolutional neural networks; feature aggregation; music auto-tagging; transfer learning
DOI
10.1109/LSP.2017.2713830
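Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification code
0808; 0809
Abstract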
Music auto-tagging is often handled in a similar manner to image classification by regarding the two-dimensional audio spectrogram as image data. However, music auto-tagging differs from image classification in that the tags are highly diverse and have different levels of abstraction. Considering this issue, we propose a convolutional neural network (CNN)-based architecture that embraces multi-level and multi-scale features. The architecture is trained in three steps. First, we conduct supervised feature learning to capture local audio features using a set of CNNs with different input sizes. Second, we extract audio features from each layer of the pretrained convolutional networks separately and aggregate them all together given a long audio clip. Finally, we feed the aggregated features into fully connected networks and make final predictions of the tags. Our experiments show that combining multi-level and multi-scale features is highly effective in music auto-tagging, and that the proposed method outperforms the previous state-of-the-art methods on the MagnaTagATune dataset and the Million Song Dataset. We further show that the proposed architecture is useful in transfer learning.
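The second step of the abstract (extracting features from each layer of each pretrained CNN and aggregating them over a long clip) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the names `mean_pool` and `aggregate`, the nested-dictionary layout, the choice of plain average pooling over segments, and the toy numbers. The pretrained CNNs themselves and the final fully connected network are not shown.

```python
from typing import Dict, List

Vector = List[float]


def mean_pool(segments: List[Vector]) -> Vector:
    """Average a list of equal-length vectors element-wise.

    Assumption: average pooling stands in for whatever pooling the
    authors actually use to summarize a long clip.
    """
    if not segments:
        raise ValueError("need at least one segment")
    n = len(segments)
    return [sum(column) / n for column in zip(*segments)]


def aggregate(features: Dict[str, Dict[str, List[Vector]]]) -> Vector:
    """Build one clip-level vector from multi-scale, multi-level features.

    features[scale][layer] holds one activation vector per local segment
    of the clip (hypothetical layout). Each (scale, layer) pair is pooled
    over its segments, and the pooled vectors are concatenated in sorted
    key order so the output layout is deterministic.
    """
    clip_vector: Vector = []
    for scale in sorted(features):
        for layer in sorted(features[scale]):
            clip_vector.extend(mean_pool(features[scale][layer]))
    return clip_vector


# Toy activations: two input scales, up to two layers each (made-up values).
toy_features = {
    "scale_short": {
        "layer_1": [[1.0, 3.0], [3.0, 5.0]],  # two segments, 2-dim features
        "layer_2": [[0.0], [2.0]],            # two segments, 1-dim features
    },
    "scale_long": {
        "layer_1": [[4.0, 4.0]],              # one segment, 2-dim features
    },
}

clip_features = aggregate(toy_features)
print(clip_features)
```

In this sketch a CNN with a longer input size yields fewer segments per clip, which is why `scale_long` has a single segment. The resulting `clip_features` vector is what a downstream classifier would receive in the third step.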
Pages: 1208-1212
Page count: 5