Image-based taxonomic classification of bulk insect biodiversity samples using deep learning and domain adaptation

Cited by: 6
Authors
Fujisawa, Tomochika [1, 6]
Noguerales, Victor [2, 3, 7]
Meramveliotakis, Emmanouil [2]
Papadopoulou, Anna [2]
Vogler, Alfried P. [4, 5]
Affiliations
[1] Shiga Univ, Ctr Data Sci Educ & Res, Hikone, Japan
[2] Univ Cyprus, Dept Biol Sci, Nicosia, Cyprus
[3] Inst Prod Nat & Agrobiol IPNA CSIC, Tenerife, Spain
[4] Nat Hist Museum, Dept Life Sci, London, England
[5] Imperial Coll London, Dept Life Sci, Silwood Pk Campus, Ascot, England
[6] Shiga Univ, Ctr Data Sci Educ & Res, 1-1-1 Banba, Hikone, Shiga 5228522, Japan
[7] Inst Prod Nat & Agrobiol IPNA CSIC, Astrofis Francisco Sanchez 3, Tenerife 38206, Spain
Keywords
biodiversity assessment; bulk sample; Coleoptera; convolutional neural network; domain adaptation; image classification; machine learning; DIVERSITY; SHOW
DOI
10.1111/syen.12583
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Complex bulk samples of insects from biodiversity surveys present a challenge for taxonomic identification, which could be overcome by high-throughput imaging combined with machine learning for rapid classification of specimens. These procedures require that taxonomic labels from an existing source data set are used for model training and prediction of an unknown target sample. However, such transfer learning may be problematic for the study of new samples not previously encountered in an image set, for example, from unexplored ecosystems, and require methods of domain adaptation that reduce the differences in the feature distribution of the source and target domains (training and test sets). We assessed the efficiency of domain adaptation for family-level classification of bulk samples of Coleoptera, as a critical first step in the characterization of biodiversity samples. Neural network models trained with images from a global database of Coleoptera were applied to a biodiversity sample from understudied forests in Cyprus as the target. Within-dataset classification accuracy reached 98% and depended on the number and quality of training images, and on dataset complexity. The accuracy of between-datasets predictions (across disparate source-target pairs that do not share any species or genera) was at most 82% and depended greatly on the standardization of the imaging procedure. An algorithm for domain adaptation, domain adversarial training of neural networks (DANN), significantly improved the prediction performance of models trained by non-standardized, low-quality images. Our findings demonstrate that existing databases can be used to train models and successfully classify images from unexplored biota, but the imaging conditions and classification algorithms need careful consideration.
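As a rough illustration of the gradient-reversal idea behind DANN mentioned in the abstract, the sketch below uses a hypothetical one-parameter feature extractor with scalar label and domain heads. The function names, the scalar model, and the default `gamma=10.0` are assumptions made for illustration (the ramp-up schedule follows the form given in the original DANN paper by Ganin et al.); nothing here is taken from the authors' implementation.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def dann_lambda(progress, gamma=10.0):
    """Adaptation weight that ramps from 0 towards 1 as training
    progress goes from 0 to 1 (schedule form from the DANN paper)."""
    return 2.0 / (1.0 + math.exp(-gamma * progress)) - 1.0


def feature_gradient(w, u, v, x, y, d, lam):
    """Gradient reaching a scalar feature weight w under gradient reversal.

    Toy model (an assumption for illustration):
      feature      f   = w * x
      label head   p_y = sigmoid(u * f), binary cross-entropy against y
      domain head  p_d = sigmoid(v * f), binary cross-entropy against d
    Target-domain samples are unlabeled, so pass y=None to skip the
    label loss. The domain-loss gradient is multiplied by -lam before
    it reaches w, which pushes the feature towards being uninformative
    about the domain.
    """
    f = w * x
    label_grad = 0.0 if y is None else (sigmoid(u * f) - y) * u
    domain_grad = (sigmoid(v * f) - d) * v
    return (label_grad - lam * domain_grad) * x


# Labeled source sample (d = 0) early in training, then an unlabeled
# target sample (d = 1) late in training.
g_source = feature_gradient(w=1.0, u=1.0, v=1.0, x=1.0, y=1, d=0,
                            lam=dann_lambda(0.1))
g_target = feature_gradient(w=1.0, u=1.0, v=1.0, x=1.0, y=None, d=1,
                            lam=dann_lambda(0.9))
```

With `lam = 0` the update reduces to ordinary supervised training on the source images; as `lam` grows, the feature weight is increasingly pushed to confuse the domain head while the label head is still trained on source labels.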
Pages: 387-401
Number of pages: 15