Unsupervised feature learning for environmental sound classification using Weighted Cycle-Consistent Generative Adversarial Network

Cited by: 36

Authors:

Esmaeilpour, Mohammad ^{[1]}

Cardinal, Patrick ^{[1]}

Koerich, Alessandro Lameiras ^{[1]}

Affiliations:

[1] Univ Quebec, ETS, 1100 Notre Dame West, Montreal, PQ H3C 1K3, Canada

Source:

APPLIED SOFT COMPUTING | 2020 / Vol. 86

Funding:

Natural Sciences and Engineering Research Council of Canada;

Keywords:

Environmental sound classification; Generative Adversarial Network (GAN); Cycle-Consistent GAN; K-Means++; Random forests; QUALITY ASSESSMENT; AUDIO; RECOGNITION;

DOI:

10.1016/j.asoc.2019.105912

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, we propose a novel environmental sound classification approach incorporating unsupervised feature learning via the spherical K-Means++ algorithm and a new architecture for high-level data augmentation. The audio signal is transformed into a 2D representation using a discrete wavelet transform (DWT). The DWT spectrograms are then augmented by a novel cycle-consistent generative adversarial network architecture. This high-level augmentation bootstraps generated spectrograms in both intra- and inter-class manners by translating structural features from sample to sample. A codebook is built by coding the DWT spectrograms with the speeded-up robust feature detector and the K-Means++ algorithm. A random forest is the final learning algorithm, which learns the environmental sound classification task from the code vectors. Experimental results on four benchmark environmental sound datasets (ESC-10, ESC-50, UrbanSound8k, and DCASE-2017) show that the proposed classification approach outperforms most state-of-the-art classifiers, including convolutional neural networks such as AlexNet and GoogLeNet, improving the classification rate by between 3.51% and 14.34%, depending on the dataset. (C) 2019 Elsevier B.V. All rights reserved.
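The codebook step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes local descriptors (for example, SURF descriptors taken from a DWT spectrogram) are already available as plain lists of floats, and the function names `kmeanspp_seed`, `spherical_kmeans`, and `encode` are hypothetical. It shows k-means++ seeding, spherical k-means on unit-normalised vectors, and the encoding of a set of descriptors as a histogram of nearest-codeword counts, which is the kind of code vector a classifier such as a random forest could then consume.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    """Inner product; equals cosine similarity for unit vectors."""
    return sum(x * y for x, y in zip(a, b))


def kmeanspp_seed(points, k, rng):
    """k-means++ seeding: each new centre is drawn with probability
    proportional to its squared distance from the nearest chosen centre."""
    centres = [rng.choice(points)]
    while len(centres) < k:
        # For unit vectors, squared Euclidean distance is 2 - 2 * cosine.
        d2 = [max(0.0, min(2.0 - 2.0 * dot(p, c) for c in centres))
              for p in points]
        total = sum(d2)
        if total == 0.0:  # every point coincides with an existing centre
            centres.append(rng.choice(points))
            continue
        r = rng.uniform(0.0, total)
        acc = 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc > r:
                centres.append(p)
                break
    return centres


def spherical_kmeans(points, k, iters=20, seed=0):
    """Cluster unit-normalised points by cosine similarity; returns k unit centres."""
    rng = random.Random(seed)
    points = [normalize(p) for p in points]
    centres = kmeanspp_seed(points, k, rng)
    for _ in range(iters):
        sums = [[0.0] * len(points[0]) for _ in range(k)]
        for p in points:
            j = max(range(k), key=lambda i: dot(p, centres[i]))
            sums[j] = [s + x for s, x in zip(sums[j], p)]
        # Re-normalise each summed direction; keep the old centre if the
        # cluster is empty or its members cancel out.
        centres = [normalize(s) if any(s) else c for s, c in zip(sums, centres)]
    return centres


def encode(descriptors, centres):
    """Code vector: count of descriptors assigned to each codeword."""
    hist = [0] * len(centres)
    for d in descriptors:
        u = normalize(d)
        hist[max(range(len(centres)), key=lambda i: dot(u, centres[i]))] += 1
    return hist
```

Under these assumptions, `spherical_kmeans(all_descriptors, k)` would be run once over descriptors pooled from the training spectrograms, and `encode(descriptors_of_one_spectrogram, centres)` would then give one fixed-length vector per sample. The codebook size `k` and the iteration count are illustrative parameters, not values taken from the paper.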


Pages: 13

