Automatically identifying, counting, and describing wild animals in camera-trap images with deep learning

Cited by: 656

Authors:

Norouzzadeh, Mohammad Sadegh ^{[1]}

Nguyen, Anh ^{[2]}

Kosmala, Margaret ^{[3]}

Swanson, Alexandra ^{[4]}

Palmer, Meredith S. ^{[5]}

Packer, Craig ^{[5]}

Clune, Jeff ^{[1,6]}

Affiliations:

[1] Univ Wyoming, Dept Comp Sci, Laramie, WY 82071 USA

[2] Auburn Univ, Dept Comp Sci & Software Engn, Auburn, AL 36849 USA

[3] Harvard Univ, Dept Organism & Evolutionary Biol, Cambridge, MA 02138 USA

[4] Univ Oxford, Dept Phys, Oxford OX1 3RH, England

[5] Univ Minnesota, Dept Ecol Evolut & Behav, St Paul, MN 55108 USA

[6] Uber AI Labs, San Francisco, CA 94103 USA

Source:

PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA | 2018 / Vol. 115 / Issue 25

Funding:

US National Science Foundation;

Keywords:

deep learning; deep neural networks; artificial intelligence; camera-trap images; wildlife ecology; MANAGEMENT; LANDSCAPE; SOFTWARE; MODEL; FEAR;

DOI:

10.1073/pnas.1719367115

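Chinese Library Classification (CLC) number:

O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];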

Discipline classification code:

07 ; 0710 ; 09 ;

Abstract:

Having accurate, detailed, and up-to-date information about the location and behavior of animals in the wild would improve our ability to study and conserve ecosystems. We investigate the ability to automatically, accurately, and inexpensively collect such data, which could help catalyze the transformation of many fields of ecology, wildlife biology, zoology, conservation biology, and animal behavior into "big data" sciences. Motion-sensor "camera traps" enable collecting wildlife pictures inexpensively, unobtrusively, and frequently. However, extracting information from these pictures remains an expensive, time-consuming, manual task. We demonstrate that such information can be automatically extracted by deep learning, a cutting-edge type of artificial intelligence. We train deep convolutional neural networks to identify, count, and describe the behaviors of 48 species in the 3.2 million-image Snapshot Serengeti dataset. Our deep neural networks automatically identify animals with >93.8% accuracy, and we expect that number to improve rapidly in years to come. More importantly, if our system classifies only images it is confident about, our system can automate animal identification for 99.3% of the data while still performing at the same 96.6% accuracy as that of crowdsourced teams of human volunteers, saving >8.4 y (i.e., >17,000 h at 40 h/wk) of human labeling effort on this 3.2 million-image dataset. Those efficiency gains highlight the importance of using deep neural networks to automate data extraction from camera-trap images, reducing a roadblock for this widely used technology. Our results suggest that deep learning could enable the inexpensive, unobtrusive, high-volume, and even real-time collection of a wealth of information about vast numbers of animals in the wild.
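The abstract's point about classifying only the images the system is confident about can be illustrated with a minimal sketch of confidence thresholding. This is not the authors' code: the function name `selective_accuracy`, the threshold value, and the five-image toy dataset are all hypothetical, chosen only to show how coverage (the share of images handled automatically) trades off against accuracy on the retained images.

```python
from typing import Sequence, Tuple


def selective_accuracy(
    confidences: Sequence[float],
    predictions: Sequence[str],
    labels: Sequence[str],
    threshold: float,
) -> Tuple[float, float]:
    """Return (coverage, accuracy) when only predictions whose
    confidence is at least `threshold` are accepted.

    coverage: fraction of all images that are classified automatically.
    accuracy: fraction of the accepted predictions that match the label.
    """
    kept = [
        (pred, label)
        for conf, pred, label in zip(confidences, predictions, labels)
        if conf >= threshold
    ]
    coverage = len(kept) / len(labels) if labels else 0.0
    if not kept:
        return coverage, float("nan")
    accuracy = sum(pred == label for pred, label in kept) / len(kept)
    return coverage, accuracy


# Hypothetical toy data (not from the Snapshot Serengeti dataset).
confidences = [0.99, 0.95, 0.60, 0.98, 0.55]
predictions = ["zebra", "wildebeest", "lion", "zebra", "gazelle"]
labels = ["zebra", "wildebeest", "hyena", "zebra", "impala"]

# Accept everything: full coverage, lower accuracy.
all_coverage, all_accuracy = selective_accuracy(confidences, predictions, labels, 0.0)

# Accept only confident predictions: lower coverage, higher accuracy.
coverage, accuracy = selective_accuracy(confidences, predictions, labels, 0.9)
```

In a workflow of this kind, images that fall below the threshold would be routed to human labelers, so the threshold sets how much manual effort remains.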

Citation

Pages: E5716-E5725

Page count: 10
