Birdsong classification based on multi-feature fusion

Cited by: 0
Authors
Na Yan
Aibin Chen
Guoxiong Zhou
Zhiqiang Zhang
Xiangyong Liu
Jianwu Wang
Zhihua Liu
Wenjie Chen
Affiliations
[1] Central South University of Forestry and Technology, Institute of Artificial Intelligence Application, College of Computer and Information Engineering
[2] Central South University of Forestry and Technology, Wildlife Conservation and Utilization Laboratory, College of Forestry
[3] Central South University of Forestry and Technology, Hunan Provincial Key Laboratory of Urban Forest Ecology, College of Life Science and Technology
[4] Hunan Zixing Artificial Intelligence Research Academy
[5] HuangFengQiao State-Owned Forest Farm
[6] YouXian County
Source
Multimedia Tools and Applications | 2021 / Vol. 80
Keywords
Birdsong classification; Acoustic feature; Feature fusion; 3DCNN-LSTM
Abstract
The classification of birdsong is of great significance for monitoring bird populations in their habitats. For birdsong datasets with complex and diverse audio backgrounds, this paper introduces Chroma, an acoustic feature used in voice and music analysis. Chroma is spliced and fused with the commonly used birdsong features, Log-Mel Spectrogram (LM) and Mel Frequency Cepstrum Coefficient (MFCC), to enrich the representational capacity of a single feature. At the same time, because birdsong changes continuously and dynamically over time, a combined 3DCNN-LSTM model is proposed as the classifier to make the network more sensitive to birdsong information that changes with time. We selected audio data of four birds from the Xeno-Canto website to evaluate how LM, MFCC and Chroma should be fused to maximize the birdsong audio information. The experimental results show that the LM-MFCC-C feature combination achieves the best result in the experiment, a mean average precision (mAP) of 97.9%.
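The splicing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature sizes (64 Mel bands, 20 MFCCs, 12 chroma bins), the per-feature standardization, and the function names `standardize` and `fuse_features` are assumptions made for illustration, and random arrays stand in for real features. In practice the three matrices might come from a library such as librosa (`librosa.feature.melspectrogram`, `librosa.feature.mfcc`, `librosa.feature.chroma_stft`).

```python
import numpy as np


def standardize(feature):
    """Scale one feature matrix to zero mean and unit variance.

    This preprocessing step is an assumption; the abstract does not say
    how the features are normalized before fusion.
    """
    return (feature - feature.mean()) / (feature.std() + 1e-8)


def fuse_features(log_mel, mfcc, chroma):
    """Splice three (n_bins, n_frames) matrices along the feature axis."""
    frame_counts = {log_mel.shape[1], mfcc.shape[1], chroma.shape[1]}
    if len(frame_counts) != 1:
        raise ValueError("all features must share the same number of frames")
    parts = [standardize(f) for f in (log_mel, mfcc, chroma)]
    return np.concatenate(parts, axis=0)


# Placeholder features (hypothetical sizes), standing in for real audio features.
rng = np.random.default_rng(0)
n_frames = 100
log_mel = rng.normal(size=(64, n_frames))  # Log-Mel Spectrogram (LM)
mfcc = rng.normal(size=(20, n_frames))     # MFCC
chroma = rng.random(size=(12, n_frames))   # Chroma

fused = fuse_features(log_mel, mfcc, chroma)
print(fused.shape)  # (96, 100)
```

The fused matrix keeps the time axis intact, so a sequence model such as the 3DCNN-LSTM named in the abstract could still see how the features change from frame to frame.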
Pages: 36529-36547
Page count: 18