Multi-label classification for acoustic bird species detection using transfer learning approach

Cited by: 20
Authors
Swaminathan, Bhuvaneswari [1]
Jagadeesh, M. [1]
Vairavasundaram, Subramaniyaswamy [1]
Affiliations
[1] SASTRA Deemed Univ, Sch Comp, Thanjavur 613401, India
Keywords
Wav2vec; Transformers; Transfer learning; Multi-label; Bird species classification; Audio classification; RECOGNITION;
DOI
10.1016/j.ecoinf.2024.102471
Chinese Library Classification (CLC) number
Q14 [Ecology (Bioecology)];
Discipline classification code
071012; 0713;
Abstract
As part of ornithology, bird species classification is vital to understanding species distribution, habitat requirements, and the environmental changes that affect bird populations. By tracking changes in bird species distributions, ornithologists can assess the health of a given habitat. This work extends an efficient transfer learning technique to label and classify multiple bird species from real-time audio recordings. For this purpose, Wav2vec is fine-tuned using backpropagation, which makes the feature extractor more effective at learning each bird's pitch and other sound characteristics. Each audio recording is clipped into overlapping chunks so that multiple labels can be determined from it. Through transfer learning, features are extracted automatically from the audio recordings and fed to a feed-forward network for classification. The probabilities associated with each audio segment are then aggregated across the clipped chunks to represent the multiple bird species calling in the recording. These probability scores are used to determine which predominant bird species are present in the recording for multi-labelling. The proposed Wav2vec-based approach achieves an F1-score of 0.89 on the Xeno-Canto dataset, outperforming other multi-label classifiers.
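The clipping and aggregation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `split_into_chunks` and `aggregate_labels`, the zero-padding of the final chunk, the use of the mean as the aggregation rule, and the 0.5 decision threshold are all assumptions made for illustration, and the per-chunk probabilities are assumed to come from some fine-tuned model that is not shown here.

```python
import numpy as np


def split_into_chunks(waveform, chunk_len, hop_len):
    """Cut a 1-D waveform into overlapping fixed-length chunks.

    Consecutive chunks overlap whenever hop_len < chunk_len. The tail is
    zero-padded so the last chunk has full length (an assumption; the
    abstract does not say how the tail is handled).
    """
    if chunk_len <= 0 or hop_len <= 0:
        raise ValueError("chunk_len and hop_len must be positive")
    waveform = np.asarray(waveform, dtype=np.float32)
    n = waveform.shape[0]
    # Number of chunks needed to cover every sample.
    num = 1 if n <= chunk_len else 1 + int(np.ceil((n - chunk_len) / hop_len))
    padded_len = (num - 1) * hop_len + chunk_len
    padded = np.pad(waveform, (0, padded_len - n))
    return np.stack(
        [padded[i * hop_len : i * hop_len + chunk_len] for i in range(num)]
    )


def aggregate_labels(chunk_probs, threshold=0.5):
    """Combine per-chunk species probabilities into recording-level labels.

    chunk_probs has shape (num_chunks, num_species). Averaging over chunks
    and thresholding is a hypothetical aggregation rule chosen for this
    sketch, not one taken from the paper.
    """
    chunk_probs = np.asarray(chunk_probs, dtype=np.float64)
    recording_probs = chunk_probs.mean(axis=0)
    labels = np.flatnonzero(recording_probs >= threshold)
    return recording_probs, labels
```

In use, each row returned by `split_into_chunks` would be passed through the classifier, and the resulting probability rows stacked and handed to `aggregate_labels`. Taking the maximum over chunks instead of the mean would favor species that call only briefly, so the choice of rule changes which species count as predominant.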
Pages: 12