Compositional Convolutional Neural Networks: A Robust and Interpretable Model for Object Recognition Under Occlusion

Cited by: 51

Authors:

Kortylewski, Adam [1]

Liu, Qing [1]

Wang, Angtian [1]

Sun, Yihong [1]

Yuille, Alan [1]

Affiliations:

[1] Johns Hopkins Univ, Baltimore, MD 21218 USA

Source:

INTERNATIONAL JOURNAL OF COMPUTER VISION | 2021 / Volume 129 / Issue 03

Funding:

Swiss National Science Foundation;

Keywords:

Compositional models; Robustness to partial occlusion; Image classification; Object detection; Out-of-distribution generalization;

DOI:

10.1007/s11263-020-01401-3

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Computer vision systems in real-world applications need to be robust to partial occlusion while also being explainable. In this work, we show that black-box deep convolutional neural networks (DCNNs) have only limited robustness to partial occlusion. We overcome these limitations by unifying DCNNs with part-based models into Compositional Convolutional Neural Networks (CompositionalNets), an interpretable deep architecture with innate robustness to partial occlusion. Specifically, we propose to replace the fully connected classification head of DCNNs with a differentiable compositional model that can be trained end-to-end. The structure of the compositional model enables CompositionalNets to decompose images into objects and context, as well as to further decompose object representations in terms of individual parts and the objects' pose. The generative nature of our compositional model enables it to localize occluders and to recognize objects based on their non-occluded parts. We conduct extensive experiments in terms of image classification and object detection on images of artificially occluded objects from the PASCAL3D+ and ImageNet datasets, and real images of partially occluded vehicles from the MS-COCO dataset. Our experiments show that CompositionalNets made from several popular DCNN backbones (VGG-16, ResNet50, ResNeXt) improve by a large margin over their non-compositional counterparts at classifying and detecting partially occluded objects. Furthermore, they can localize occluders accurately despite being trained with class-level supervision only. Finally, we demonstrate that CompositionalNets provide human-interpretable predictions, as their individual components can be understood as detecting parts and estimating an object's viewpoint.
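
Illustrative sketch (not from the paper): the abstract states that the generative model can localize occluders and recognize objects from their non-occluded parts. One simple way to picture this, assuming each feature-map position is explained either by a class-specific object model or by a class-agnostic occluder model, is to keep the better of the two log-likelihoods at every position. The function name `occlusion_aware_scores`, the array shapes, and the toy numbers below are hypothetical and chosen only for illustration; the paper's actual formulation may differ.

```python
import numpy as np

def occlusion_aware_scores(obj_loglik, occ_loglik):
    """Score each class while letting an occluder model explain bad positions.

    obj_loglik: (C, H, W) log-likelihood of each position under each class model.
    occ_loglik: (H, W) log-likelihood of each position under an occluder model.
    Returns per-class scores (C,) and boolean occlusion maps (C, H, W).
    Assumption: a hard per-position choice between object and occluder.
    """
    occ = occ_loglik[None, :, :]            # broadcast over the class axis
    occluded = occ > obj_loglik             # positions better explained by the occluder
    per_position = np.maximum(obj_loglik, occ)
    scores = per_position.sum(axis=(1, 2))  # total log-likelihood per class
    return scores, occluded

# Toy 2x2 feature map with two classes (made-up values).
obj = np.array([
    [[-1.0, -1.0], [-1.0, -20.0]],  # class 0 fits well except one occluded position
    [[-4.0, -4.0], [-4.0, -4.0]],   # class 1 fits poorly everywhere
])
occ = np.full((2, 2), -3.0)

naive_prediction = int(np.argmax(obj.sum(axis=(1, 2))))
scores, occluded = occlusion_aware_scores(obj, occ)
predicted = int(np.argmax(scores))
```

In this toy case the plain sum is dominated by the single badly fitting position and picks class 1, whereas the occlusion-aware score picks class 0 and flags that position in `occluded[0]` as the occluder location.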


Pages: 736-760

Page count: 25
