Understanding Object Detection Through an Adversarial Lens

Cited by: 18
Authors
Chow, Ka-Ho [1 ]
Liu, Ling [1 ]
Gursoy, Mehmet Emre [1 ]
Truex, Stacey [1 ]
Wei, Wenqi [1 ]
Wu, Yanzhao [1 ]
Affiliations
[1] Georgia Inst Technol, Atlanta, GA 30332 USA
Source
COMPUTER SECURITY - ESORICS 2020, PT II | 2020, Vol. 12309
Funding
U.S. National Science Foundation
Keywords
Adversarial robustness; Object detection; Attack evaluation framework; Deep neural networks;
DOI
10.1007/978-3-030-59013-0_23
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Deep neural network-based object detection models have revolutionized computer vision and fueled the development of a wide range of visual recognition applications. However, recent studies have revealed that deep object detectors can be compromised by adversarial attacks, causing a victim detector to detect no objects, fake objects, or mislabeled objects. With object detection being used pervasively in many security-critical applications, such as autonomous vehicles and smart cities, we argue that a holistic approach to an in-depth understanding of adversarial attacks and vulnerabilities of deep object detection systems is of utmost importance for the research community to develop robust defense mechanisms. This paper presents a framework for analyzing and evaluating vulnerabilities of state-of-the-art object detectors under an adversarial lens, aiming to demystify the attack strategies, adverse effects, and costs, as well as the cross-model and cross-resolution transferability of attacks. Using a set of quantitative metrics, extensive experiments are performed on six representative deep object detectors from three popular families (YOLOv3, SSD, and Faster R-CNN) with two benchmark datasets (PASCAL VOC and MS COCO). We demonstrate that the proposed framework can serve as a methodical benchmark for analyzing adversarial behaviors and risks in real-time object detection systems. We conjecture that this framework can also serve as a tool to assess the security risks and the adversarial robustness of deep object detectors to be deployed in real-world applications.
Pages: 460-481
Page count: 22
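
To make the kind of benign-versus-adversarial comparison described in the abstract more concrete, the following is a minimal, hypothetical sketch and not the authors' framework, attacks, or metrics: it perturbs an input with a simple FGSM-style step against a pretrained torchvision Faster R-CNN and counts confident detections before and after. The choice of model, the epsilon value, the random placeholder image, the dummy annotation, and the confidence-count proxy are all illustrative assumptions.

```python
# Minimal, hypothetical sketch (not the paper's framework): degrade a pretrained
# Faster R-CNN with an FGSM-style perturbation and compare confident detections.
# Assumes torch and torchvision >= 0.13; epsilon, image, and target are placeholders.
import torch
import torchvision


def fgsm_perturb(model, image, target, epsilon=2.0 / 255):
    """Craft an L-inf-bounded perturbation by ascending the detector's training loss."""
    model.train()  # training mode makes torchvision detectors return a loss dict
    image = image.clone().detach().requires_grad_(True)
    loss_dict = model([image], [target])
    loss = sum(loss_dict.values())  # classification + box-regression (+ RPN) losses
    loss.backward()
    adv = image + epsilon * image.grad.sign()
    return adv.clamp(0.0, 1.0).detach()


def count_confident_boxes(model, image, score_thresh=0.5):
    """Count detections above a confidence threshold (a crude proxy, not mAP)."""
    model.eval()
    with torch.no_grad():
        output = model([image])[0]
    return int((output["scores"] > score_thresh).sum())


if __name__ == "__main__":
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    image = torch.rand(3, 480, 640)  # placeholder; use a real VOC/COCO image in practice
    # Dummy ground truth; real annotations come from the benchmark dataset.
    target = {"boxes": torch.tensor([[100.0, 100.0, 300.0, 300.0]]),
              "labels": torch.tensor([1])}
    benign = count_confident_boxes(model, image)
    adversarial = count_confident_boxes(model, fgsm_perturb(model, image, target))
    print(f"confident detections: benign={benign}, adversarial={adversarial}")
```

In a full evaluation along the lines the abstract describes, one would instead run attacks over PASCAL VOC or MS COCO images, compute mAP and attack cost per detector, and repeat across detector families and input resolutions to study transferability.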