Fractional-Order Calculus-Based Data Augmentation Methods for Environmental Sound Classification with Deep Learning

Cited by: 4
Authors
Yazgac, Bilgi Gorkem [1]
Kirci, Murvet [1]
Affiliations
[1] Istanbul Tech Univ, Dept Elect & Elect, TR-34469 Istanbul, Turkey
Keywords
data augmentation; fractional order calculus; environmental sound classification; deep learning; NEURAL-NETWORK; RECOGNITION; BIFURCATIONS
DOI
10.3390/fractalfract6100555
Chinese Library Classification number
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
In this paper, we propose two fractional-order calculus-based data augmentation methods for audio signals. The first approach is based on fractional differentiation of the Mel scale: a randomly selected fractional derivative order warps the Mel scale, which in turn augments Mel-scale-based time-frequency representations of audio data. The second approach is based on previous fractional-order image edge enhancement methods. Since many deep learning approaches treat Mel spectrogram representations as images, a fractional-order differential-based mask is employed. The mask parameters are generated from randomly selected fractional-order derivative parameters. The proposed data augmentation methods are applied to the UrbanSound8k environmental sound dataset. To classify the dataset and test the methods, an arbitrary convolutional neural network is implemented. Our results show that fractional-order calculus-based methods can be employed as data augmentation methods. When the dataset was enlarged to six times its original size, classification accuracy increased by around 8.5%. Additional tests on more complex networks also produced better accuracy than the non-augmented dataset. To our knowledge, this paper is the first example of employing fractional-order calculus as an audio data augmentation tool.
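The two ideas in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes the Grünwald-Letnikov (GL) discretization of the fractional derivative, the common HTK-style Mel formula, a sampling range of (0, 1) for the derivative order, and a simple 3-tap mask along the frequency axis; none of these choices is stated in the abstract, and all function names are hypothetical.

```python
import math
import random


def gl_coefficients(alpha, n_terms):
    """Grunwald-Letnikov weights w_k = (-1)^k * binom(alpha, k) for k = 0..n_terms-1."""
    weights = [1.0]
    for k in range(1, n_terms):
        # Standard recurrence: w_k = w_{k-1} * (1 - (alpha + 1) / k)
        weights.append(weights[-1] * (1.0 - (alpha + 1.0) / k))
    return weights


def hz_to_mel(freq_hz):
    """HTK-style Mel formula (an assumption; the paper may use another variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def gl_fractional_derivative(samples, alpha, step):
    """Truncated GL approximation of the order-alpha derivative on a uniform grid."""
    weights = gl_coefficients(alpha, len(samples))
    scale = step ** (-alpha)
    return [
        scale * sum(weights[k] * samples[n - k] for k in range(n + 1))
        for n in range(len(samples))
    ]


def warped_mel_curve(alpha, f_max=8000.0, n_points=64):
    """Sample the Mel curve on [0, f_max] and differentiate it to order alpha."""
    freqs = [f_max * i / (n_points - 1) for i in range(n_points)]
    mels = [hz_to_mel(f) for f in freqs]
    return freqs, gl_fractional_derivative(mels, alpha, freqs[1] - freqs[0])


def apply_frequency_mask(spec, alpha, taps=3):
    """Filter a spectrogram (list of Mel-bin rows) along frequency with GL weights.

    A 1-D stand-in for the 2-D edge-enhancement masks the abstract refers to.
    """
    weights = gl_coefficients(alpha, taps)
    n_bins, n_frames = len(spec), len(spec[0])
    return [
        [
            sum(weights[k] * spec[m - k][t] for k in range(taps) if m - k >= 0)
            for t in range(n_frames)
        ]
        for m in range(n_bins)
    ]


rng = random.Random(0)
alpha = rng.uniform(0.0, 1.0)  # hypothetical sampling range for the random order
freqs, warped = warped_mel_curve(alpha)
toy_spec = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 Mel bins x 2 frames, made-up values
augmented = apply_frequency_mask(toy_spec, alpha)
```

With `alpha = 0` the GL weights reduce to the identity, and with `alpha = 1` they reduce to a first difference, so a randomly drawn order in between interpolates between leaving the representation unchanged and differentiating it.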
Pages: 15