Convolutional Neural Networks for Scops Owl Sound Classification

Cited by: 26
Authors
Hidayat, Alam Ahmad [1 ]
Cenggoro, Tjeng Wawan [1 ,2 ]
Pardamean, Bens [1 ,3 ]
Affiliations
[1] Bina Nusantara Univ, Bioinformat & Data Sci Res Ctr, Jakarta 11480, Indonesia
[2] Bina Nusantara Univ, Sch Comp Sci, Comp Sci Dept, Jakarta 11480, Indonesia
[3] Bina Nusantara Univ, Comp Sci Dept, BINUS Grad Program Master Comp Sci, Jakarta 11480, Indonesia
Source
5TH INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND COMPUTATIONAL INTELLIGENCE 2020 | 2021, Vol. 179
Keywords
acoustic features; bird sound classification; convolutional neural network; mean average precision; scops owl;
DOI
10.1016/j.procs.2021.12.010
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Adopting deep learning models for bird sound classification has become common practice in building robust automated bird sound detection systems. In this paper, we employ a four-layer Convolutional Neural Network (CNN) to classify different species of Indonesian scops owls based on their vocalizations. Two widely used representations of an acoustic signal, the log-scaled mel-spectrogram and Mel-Frequency Cepstral Coefficients (MFCC), are extracted from each sound file and fed into the network separately to compare model performance across inputs. A more complex CNN that simultaneously processes the two acoustic representations is proposed to provide a direct comparison with the baseline model. The dual-input network is the best-performing model in our experiments, achieving 97.55% Mean Average Precision (MAP). Meanwhile, the baseline model achieves a MAP of 94.36% with the mel-spectrogram input and 96.08% with the MFCC input. (C) 2021 The Authors. Published by Elsevier B.V.
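The dual-input idea described in the abstract can be sketched as follows: each acoustic representation (log-mel spectrogram and MFCC matrix) passes through its own convolutional branch, the pooled branch outputs are concatenated, and a final dense layer produces class probabilities. This is a minimal illustrative sketch only; the layer sizes, kernel counts, the class count (5), and the random inputs are all assumptions, not details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d_relu(x, kernels):
    """Naive 'valid' 2-D convolution with ReLU: x is (H, W), kernels is (K, kh, kw)."""
    K, kh, kw = kernels.shape
    H, W = x.shape
    out = np.empty((K, H - kh + 1, W - kw + 1))
    for k in range(K):
        for i in range(H - kh + 1):
            for j in range(W - kw + 1):
                out[k, i, j] = np.sum(x[i:i + kh, j:j + kw] * kernels[k])
    return np.maximum(out, 0.0)

def branch(x, kernels):
    """One convolutional branch followed by global average pooling -> (K,) vector."""
    return conv2d_relu(x, kernels).mean(axis=(1, 2))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

n_species = 5                               # illustrative class count (assumption)
mel = rng.standard_normal((64, 100))        # stand-in log-mel spectrogram (64 bands x 100 frames)
mfcc = rng.standard_normal((20, 100))       # stand-in MFCC matrix (20 coefficients x 100 frames)

# Separate (untrained, random) kernels per branch, then a shared output layer.
k_mel = rng.standard_normal((8, 3, 3)) * 0.1
k_mfcc = rng.standard_normal((8, 3, 3)) * 0.1
W_out = rng.standard_normal((n_species, 16)) * 0.1

# Concatenate the two branch feature vectors before classification.
features = np.concatenate([branch(mel, k_mel), branch(mfcc, k_mfcc)])
probs = softmax(W_out @ features)
print(probs.shape)
```

In a real system the convolutions would be stacked (the paper uses four layers) and trained end to end; the point here is only the two-branch, concatenate-then-classify structure that distinguishes the dual-input model from the single-input baseline.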
Pages: 81-87
Page count: 7