Automatic taxonomic identification based on the Fossil Image Dataset (>415,000 images) and deep convolutional neural networks

Cited by: 33
Authors
Liu, Xiaokang [1]
Jiang, Shouyi [1]
Wu, Rui [1]
Shu, Wenchao [1]
Hou, Jie [1]
Sun, Yongfang [1]
Sun, Jiarui [1]
Chu, Daoliang [1]
Wu, Yuyang [1]
Song, Haijun [1]
Affiliations
[1] China Univ Geosci, Sch Earth Sci, State Key Lab Biogeol & Environm Geol, Wuhan 430074, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
CLASSIFICATION; FORAMINIFERA; RECOGNITION;
DOI
10.1017/pab.2022.14
Chinese Library Classification number
X176 [Biodiversity conservation];
Discipline classification code
090705;
Abstract
The rapid and accurate taxonomic identification of fossils is of great significance in paleontology, biostratigraphy, and other fields. However, taxonomic identification is often labor-intensive and tedious, and acquiring the extensive prior knowledge needed for a taxonomic group requires long-term training. Moreover, identification results are often inconsistent across researchers and communities. Accordingly, in this study, we used deep learning to support taxonomic identification. We used web crawlers to collect the Fossil Image Dataset (FID) from the Internet, obtaining 415,339 images belonging to 50 fossil clades. We then trained three powerful convolutional neural networks on a high-performance workstation. The Inception-ResNet-v2 architecture achieved an average accuracy of 0.90 on the test dataset when transfer learning was applied. The clades of microfossils and vertebrate fossils exhibited the highest identification accuracies, 0.95 and 0.90, respectively. In contrast, clades of sponges, bryozoans, and trace fossils, which have varied morphologies or few samples in the dataset, exhibited accuracies below 0.80. Visual explanation methods further highlighted the discrepancies among different fossil clades and suggested similarities between the identifications made by machine classifiers and those made by taxonomists. Collecting large paleontological datasets from various sources, such as the literature, digitization of dark data, citizen-science data, and public data from the Internet, may further enhance deep learning methods and their adoption. Such developments may also lead to image-based systematic taxonomy being replaced by machine-aided classification in the future. Pioneering studies could include microfossils and some invertebrate fossils. To contribute to this development, we deployed our model on a server for public access at www.ai-fossil.com.
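The abstract reports both an average accuracy and per-clade accuracies. As a minimal sketch of how such figures can be derived from test-set predictions, the Python below computes the fraction of correctly identified images per clade and then an unweighted mean across clades. This is an illustration only: the abstract does not state how the authors averaged their scores, and the function names and toy labels are hypothetical, not taken from the paper or its code.

```python
from collections import Counter


def per_clade_accuracy(true_labels, predicted_labels):
    """Return {clade: fraction of that clade's images predicted correctly}."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    # Number of test images belonging to each clade.
    totals = Counter(true_labels)
    # Number of correctly identified images for each clade.
    correct = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    return {clade: correct[clade] / totals[clade] for clade in totals}


def macro_average(per_clade):
    """Unweighted mean of the per-clade scores (an assumed averaging scheme)."""
    return sum(per_clade.values()) / len(per_clade)


# Toy labels, purely illustrative (not data from the Fossil Image Dataset).
true_labels = [
    "foraminifera", "foraminifera",
    "sponge", "sponge",
    "trilobite", "trilobite",
]
predicted_labels = [
    "foraminifera", "foraminifera",
    "sponge", "bryozoan",
    "trilobite", "trilobite",
]

scores = per_clade_accuracy(true_labels, predicted_labels)
average = macro_average(scores)
print(scores)
print(average)
```

An unweighted mean treats every clade equally, so clades with few samples influence the average as much as well-represented ones; a sample-weighted mean would be an equally plausible reading of "average accuracy".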
Pages: 1 / 22
Number of pages: 22