A comprehensive study based on MFCC and spectrogram for audio classification

Cited by: 2
Authors
Rawat, Priyanshu [1]
Bajaj, Madhvan [1]
Vats, Satvik [1]
Sharma, Vikrant [1]
Affiliations
[1] Graphic Era Hill Univ, Dept Comp Sci & Engn, Dehra Dun, Uttarakhand, India
Keywords
Music genre classification; Convolution Neural Network (CNN); Signal processing; Image processing; Spectrogram; Artificial Neural Network (ANN); Audio analysis; MODEL; RECOGNITION; DEPLOYMENT;
DOI
10.47974/JIOS-1431
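Chinese Library Classification number
G25 [Library science, librarianship]; G35 [Information science, information work]
Discipline classification code
1205; 120501
Abstract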
Music classification is a music information retrieval (MIR) task that determines the meaning of a piece of music computationally. In recent years, deep neural networks have proven effective in numerous classification tasks, including music genre classification. In this paper, we present a comparative study of two music classification techniques. The first computes the audio's spectrogram image and predicts the genre from it using a CNN trained on spectrograms. The second extracts Mel-Frequency Cepstral Coefficients (MFCCs) as musical features and classifies the music using an ANN. The paper evaluates both approaches on different audio signals and compares their performance to determine which is better suited to music genre classification.
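The two feature types named in the abstract can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the function names, the 8 kHz sample rate, the 256-sample frame, the 128-sample hop, the 20 mel filters, and the 13 coefficients are all illustrative assumptions, and the CNN and ANN classifiers themselves are omitted. The sketch only shows how a power spectrogram is built frame by frame and how MFCCs are derived from each frame (mel filterbank, logarithm, DCT-II).

```python
import cmath
import math


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame, using a naive DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectrogram(signal, frame_len=256, hop=128):
    """One power spectrum per overlapping frame (time x frequency)."""
    return [power_spectrum(signal[s:s + frame_len])
            for s in range(0, len(signal) - frame_len + 1, hop)]


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    max_mel = hz_to_mel(sample_rate / 2)
    # n_filters + 2 edge frequencies: left edge, filter centers, right edge.
    edges = [mel_to_hz(max_mel * i / (n_filters + 1))
             for i in range(n_filters + 2)]
    bin_hz = (sample_rate / 2) / (n_bins - 1)
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for b in range(n_bins):
            f = b * bin_hz
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(power_frame, bank, n_coeffs=13):
    """MFCCs of one frame: log mel-band energies followed by a DCT-II."""
    energies = [math.log(sum(w * p for w, p in zip(row, power_frame)) + 1e-10)
                for row in bank]
    m = len(energies)
    return [sum(e * math.cos(math.pi * c * (i + 0.5) / m)
                for i, e in enumerate(energies))
            for c in range(n_coeffs)]


# Illustrative input: a 1 kHz sine sampled at an assumed 8 kHz.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(1024)]

spec = spectrogram(signal)                      # input a CNN would see as an image
bank = mel_filterbank(20, len(spec[0]), sample_rate)
features = [mfcc(frame, bank) for frame in spec]  # input an ANN would see as vectors
```

In this sketch the spectrogram is a time-by-frequency grid that could be rendered as an image, while the MFCCs compress each frame into a short vector. In practice an optimized library routine would typically replace the naive DFT used here.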
Pages: 1057-1074
Page count: 18