Multi-label emotion recognition from Indian classical music using gradient descent SNN model

Cited by: 6

Authors:

Tiple, Bhavana [1]

Patwardhan, Manasi [2]

Affiliations:

[1] Dr Vishwanath Karad MIT World Peace Univ, Sch SCET, Pune, Maharashtra, India

[2] TCS Innovat Labs, Pune, Maharashtra, India

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2022 / Vol. 81 / Issue 06

Keywords:

Convolutional neural network; Spiking neural network; Gradient descent; Temporal; Spectral; Short Term Fourier Transform; SPIKING NEURAL-NETWORK; CLASSIFICATION;

DOI:

10.1007/s11042-022-11975-4

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

The number of music enthusiasts is growing rapidly, and with it the number of songs released to the market and stored in music libraries. As a result, emotion recognition from music content has received increasing attention. Music Emotion Recognition (MER) systems have been introduced to meet the growing demand for easy and efficient access to music information. Although such systems are well developed, they struggle to maintain accuracy and have difficulty predicting multi-label emotion types. To address these shortcomings, this article develops a novel MER system that links pre-processing, feature extraction and classification steps. First, the pre-processing step converts large audio files into smaller audio frames. Next, music-related temporal, spectral and energy features are extracted from the pre-processed frames and passed to the proposed gradient descent based Spiking Neural Network (SNN) classifier. Because training the SNN requires finding weight values that reduce the training error, a gradient descent optimization approach is adopted. To demonstrate its effectiveness, the proposed model is compared with conventional classification algorithms. The proposed methodology was tested experimentally using various evaluation metrics and achieves 94.55% accuracy, outperforming the other algorithms.
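
The framing and feature extraction steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`frame_signal`, `frame_features`), the frame and hop lengths, and the choice of one feature per family (zero-crossing rate as the temporal feature, mean squared amplitude as the energy feature, spectral centroid as the spectral feature) are all assumptions made for illustration. The record does not state which specific features or frame sizes the paper uses.

```python
import numpy as np

def frame_signal(signal, frame_len, hop_len):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return signal[idx]

def frame_features(frames, sample_rate):
    """Return one [zcr, energy, centroid] row per frame (illustrative features)."""
    # Temporal feature: fraction of adjacent sample pairs that change sign.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    # Energy feature: mean squared amplitude of the frame.
    energy = np.mean(frames ** 2, axis=1)
    # Spectral feature: magnitude-weighted mean frequency (spectral centroid).
    window = np.hanning(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
    centroid = (mag @ freqs) / (mag.sum(axis=1) + 1e-12)
    return np.column_stack([zcr, energy, centroid])

# Hypothetical input: one second of a 440 Hz tone standing in for an audio file.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

frames = frame_signal(tone, frame_len=1024, hop_len=512)
features = frame_features(frames, sample_rate)
print(frames.shape, features.shape)
```

In a pipeline like the one the abstract outlines, each row of `features` would be the input to the classifier for one frame. The SNN classifier and its gradient descent training are not sketched here because the record gives no detail on the network architecture or loss.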

Citation

Pages: 8853-8870

Page count: 18
