Passive acoustic monitoring of animal populations with transfer learning

Cited by: 54
Authors
Dufourq, Emmanuel [1, 2, 3]
Batist, Carly [4]
Foquet, Ruben [5]
Durbach, Ian [6, 7]
Affiliations
[1] Stellenbosch Univ, Stellenbosch, South Africa
[2] African Inst Math Sci, Cape Town, South Africa
[3] Natl Inst Theoret & Computat Sci, Stellenbosch, South Africa
[4] CUNY, Grad Ctr, New York, NY USA
[5] Biodivers Inventory Conservat, Glabbeek, Belgium
[6] Univ St Andrews, Ctr Res Ecol & Environm Modelling, St Andrews, Fife, Scotland
[7] Univ Cape Town, Ctr Stat Ecol Environm & Conservat, Cape Town, South Africa
Keywords
Transfer learning; Convolutional neural networks; Deep learning; Vocalisation classification; Bioacoustics; IDENTIFICATION; CLASSIFICATION
DOI
10.1016/j.ecoinf.2022.101688
Chinese Library Classification number
Q14 [Ecology (Bioecology)]
Discipline classification code
071012; 0713
Abstract
Progress in deep learning, more specifically in using convolutional neural networks (CNNs) to create classification models, has been tremendous in recent years. Within bioacoustics research, a large number of recent studies use CNNs. Designing CNN architectures from scratch is non-trivial and requires knowledge of machine learning. Furthermore, the hyper-parameter tuning associated with CNNs is extremely time-consuming and requires expensive hardware. In this paper we assess whether it is possible to build good bioacoustic classifiers by adapting and re-using existing CNNs pre-trained on the ImageNet dataset instead of designing them from scratch, a strategy known as transfer learning that has proved highly successful in other domains. This study is a first attempt to conduct a large-scale investigation of how transfer learning can be used for passive acoustic monitoring (PAM), to simplify the implementation of CNNs and the design decisions involved in creating them, and to remove time-consuming hyper-parameter tuning phases. We compare 12 modern CNN architectures across 4 passive acoustic datasets that target calls of the Hainan gibbon Nomascus hainanus, the critically endangered black-and-white ruffed lemur Varecia variegata, the vulnerable Thyolo alethe Chamaetylas choloensis, and the Pin-tailed whydah Vidua macroura. We focus our work on data scarcity issues by training PAM binary classification models on very small datasets, with as few as 25 verified examples. Our findings reveal that transfer learning can achieve an F1 score of up to 82% while keeping CNN implementation details to a minimum, thus rendering this approach accessible and easier to design, and speeding up further vocalisation annotation to create robust PAM models.
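For readers unfamiliar with the metric reported in the abstract, the F1 score of a binary classifier is the harmonic mean of its precision and recall. The following is a minimal sketch of that calculation; the function name and the example counts are illustrative assumptions and are not taken from the paper or its results.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Return the F1 score (harmonic mean of precision and recall).

    Uses the equivalent count-based form F1 = 2*TP / (2*TP + FP + FN).
    Returns 0.0 when there are no positives of any kind, a common
    convention that avoids division by zero.
    """
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        return 0.0
    return 2 * true_positives / denominator


# Hypothetical counts for a call detector (made up for illustration):
# 8 calls correctly detected, 2 false alarms, 2 missed calls.
example_f1 = f1_score(true_positives=8, false_positives=2, false_negatives=2)
print(f"F1 = {example_f1:.2f}")
```

Because F1 ignores true negatives, it suits detection tasks such as PAM, where segments without the target call typically far outnumber segments with it.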
Pages: 12