Deep Convolutional Neural Networks for Predominant Instrument Recognition in Polyphonic Music Using Discrete Wavelet Transform

Authors
Dash, Sukanta Kumar [1]
Solanki, S. S. [1]
Chakraborty, Soubhik [2]
Affiliations
[1] Birla Inst Technol, Dept Elect & Commun Engn, Ranchi 835215, Jharkhand, India
[2] Birla Inst Technol, Dept Math, Ranchi 835215, Jharkhand, India
Keywords
Predominant instrument recognition; Deep convolutional neural networks; Mel-spectrogram; MFCC; DWT; PSO
DOI
10.1007/s00034-024-02641-1
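Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809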
Abstract
In this article, a new multi-input deep convolutional neural networks (deep-CNNs) model architecture is proposed for recognizing predominant instruments in polyphonic music using the discrete wavelet transform (DWT). The model takes a fusion of Mel-spectrogram and Mel-frequency cepstral coefficient (MFCC) features as its first input and a concatenation of statistical features extracted from the DWT-decomposed signals as its second input. Particle swarm optimization (PSO), used here as a feature selection algorithm, reduces the feature dimensionality by excluding irrelevant features. The model is tested on the IRMAS dataset, using the fixed-length, single-labeled training data for model training and the variable-length, multi-labeled test data for model evaluation. Several DWT feature dimensions are evaluated, and a feature dimension of 250 yields the best results. Model performance is assessed with the precision, recall, and F1 measures, averaged at both the micro and macro level. With an optimal set of hyperparameter values, the proposed model reaches micro and macro F1 measures of 0.695 and 0.631, which are 12.28% and 23.0% higher, respectively, than those of the benchmark CNN model of Han et al. (IEEE/ACM Trans Audio Speech Lang Process 25(1):208-221, 2016. https://doi.org/10.1109/taslp.2016.2632307).
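The micro- and macro-level averaging mentioned in the abstract can be illustrated with a minimal sketch. It assumes multi-label predictions given as sets of label indices per clip, and it takes macro F1 as the unweighted mean of per-label F1 scores, which is one common convention; the article may define macro F1 differently (for example, from macro-averaged precision and recall). The function name `micro_macro_f1` and the toy labels are hypothetical and are not taken from the article.

```python
def micro_macro_f1(y_true, y_pred, n_labels):
    """Return (micro F1, macro F1) for multi-label predictions.

    y_true, y_pred: lists of sets of label indices, one set per clip.
    Assumption: macro F1 is the unweighted mean of per-label F1 scores.
    """
    tp = [0] * n_labels  # true positives per label
    fp = [0] * n_labels  # false positives per label
    fn = [0] * n_labels  # false negatives per label
    for truth, pred in zip(y_true, y_pred):
        for k in range(n_labels):
            if k in pred and k in truth:
                tp[k] += 1
            elif k in pred:
                fp[k] += 1
            elif k in truth:
                fn[k] += 1

    def f1(t, p, n):
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        denom = 2 * t + p + n
        return 2 * t / denom if denom else 0.0

    # Micro: pool the counts over all labels, then compute one F1.
    micro = f1(sum(tp), sum(fp), sum(fn))
    # Macro: compute F1 per label, then average with equal weight.
    macro = sum(f1(t, p, n) for t, p, n in zip(tp, fp, fn)) / n_labels
    return micro, macro


# Toy example with three hypothetical instrument labels (0, 1, 2).
y_true = [{0}, {0, 1}, {2}, {1}]
y_pred = [{0}, {0}, {1, 2}, {1}]
micro, macro = micro_macro_f1(y_true, y_pred, n_labels=3)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")
```

Micro averaging weights every label decision equally, so frequent instruments dominate the score, whereas macro averaging weights every instrument class equally regardless of how often it occurs.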
Pages: 4239-4271
Page count: 33