Sound Classification Using Convolutional Neural Network and Tensor Deep Stacking Network

Cited by: 118

Authors:

Khamparia, Aditya [1]

Gupta, Deepak [2]

Nhu Gia Nguyen [3]

Khanna, Ashish [2]

Pandey, Babita [4]

Tiwari, Prayag [5]

Affiliations:

[1] Lovely Profess Univ, Sch Comp Sci & Engn, Phagwara 144401, India

[2] Maharaja Agrasen Inst Technol, New Delhi 110086, India

[3] Duy Tan Univ, Grad Sch, Comp Sci, Da Nang 550000, Vietnam

[4] Babasaheb Bhimrao Ambedkar Univ, Dept Comp & Informat Technol, Lucknow 226025, Uttar Pradesh, India

[5] Univ Padua, Dept Informat Engn, I-35131 Padua, Italy

Source:

IEEE ACCESS | 2019 / Vol. 7

Keywords:

Deep learning; convolutional neural network; tensor deep stacking networks; spectrograms; RECOGNITION; DIAGNOSIS; SEARCH;

DOI:

10.1109/ACCESS.2018.2888882

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

Sound plays an important role in every aspect of human life. From personal security to critical surveillance, sound is a key element in developing automated systems for these fields. A few systems are already on the market, but their efficiency is a concern for deployment in real-life scenarios. The learning capabilities of deep learning architectures can be used to develop sound classification systems that overcome the efficiency issues of traditional systems. Our aim in this paper is to use deep learning networks to classify environmental sounds based on spectrograms generated from these sounds. We used spectrogram images of environmental sounds to train a convolutional neural network (CNN) and a tensor deep stacking network (TDSN). We used two datasets for our experiments: ESC-10 and ESC-50. Both systems were trained on these datasets, and the achieved accuracy was 77% and 49% for the CNN and 56% for the TDSN trained on ESC-10. From this experiment, we conclude that the proposed approach, classifying sounds from their spectrogram images, can be used effectively to develop sound classification and recognition systems.


Pages: 7717-7727

Page count: 11

