Compositional Convolutional Neural Networks: A Robust and Interpretable Model for Object Recognition Under Occlusion

Cited by: 51

Authors:

Kortylewski, Adam [1]

Liu, Qing [1]

Wang, Angtian [1]

Sun, Yihong [1]

Yuille, Alan [1]

Affiliations:

[1] Johns Hopkins Univ, Baltimore, MD 21218 USA

Source:

INTERNATIONAL JOURNAL OF COMPUTER VISION | 2021 / Vol. 129 / Issue 03

Funding:

Swiss National Science Foundation;

Keywords:

Compositional models; Robustness to partial occlusion; Image classification; Object detection; Out-of-distribution generalization;

DOI:

10.1007/s11263-020-01401-3

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Computer vision systems in real-world applications need to be robust to partial occlusion while also being explainable. In this work, we show that black-box deep convolutional neural networks (DCNNs) have only limited robustness to partial occlusion. We overcome these limitations by unifying DCNNs with part-based models into Compositional Convolutional Neural Networks (CompositionalNets), an interpretable deep architecture with innate robustness to partial occlusion. Specifically, we propose to replace the fully connected classification head of DCNNs with a differentiable compositional model that can be trained end-to-end. The structure of the compositional model enables CompositionalNets to decompose images into objects and context, and to further decompose object representations in terms of individual parts and the object's pose. The generative nature of our compositional model enables it to localize occluders and to recognize objects based on their non-occluded parts. We conduct extensive experiments on image classification and object detection using images of artificially occluded objects from the PASCAL3D+ and ImageNet datasets, and real images of partially occluded vehicles from the MS-COCO dataset. Our experiments show that CompositionalNets built from several popular DCNN backbones (VGG-16, ResNet50, ResNeXt) improve by a large margin over their non-compositional counterparts at classifying and detecting partially occluded objects. Furthermore, they can localize occluders accurately despite being trained with class-level supervision only. Finally, we demonstrate that CompositionalNets provide human-interpretable predictions, as their individual components can be understood as detecting parts and estimating an object's viewpoint.
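
The abstract's claim that a generative model can localize occluders and score an object from its visible parts can be illustrated with a toy sketch. This is not the paper's implementation: the function name, the per-position log-likelihood maps, and the single shared occluder model are all hypothetical assumptions chosen for illustration. Each feature-map position is explained by whichever model fits it better, so occluded positions stop penalizing the object score and are flagged in a mask.

```python
import numpy as np

def occlusion_aware_score(obj_loglik, occ_loglik):
    """Toy occlusion-aware scoring (hypothetical, not the authors' code).

    obj_loglik: (H, W) per-position log-likelihood under an assumed object model.
    occ_loglik: (H, W) per-position log-likelihood under an assumed occluder model.
    Returns the summed score and a boolean mask of positions assigned to the occluder.
    """
    obj_loglik = np.asarray(obj_loglik, dtype=float)
    occ_loglik = np.asarray(occ_loglik, dtype=float)
    # A position counts as occluded when the occluder model explains it better.
    occluded = occ_loglik > obj_loglik
    # Each position contributes the log-likelihood of its better explanation.
    score = np.where(occluded, occ_loglik, obj_loglik).sum()
    return score, occluded
```

For example, with an object map `[[-1, -5], [-1, -1]]` and a constant occluder map of `-2`, only the poorly explained top-right position is assigned to the occluder, and the remaining positions still support the object.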


Pages: 736 - 760

Page count: 25

Related records (50 in total):

[41] Object Detection by a Super-Resolution Method and a Convolutional Neural Networks
Na, Bokyoon
Fox, Geoffrey C.
2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 2263 - 2269
[42] Probabilistic Model of Object Detection Based on Convolutional Neural Network
Li, Fang-Qi
Ren, Xu-Die
Guo, Hao-Nan
COMMUNICATIONS, SIGNAL PROCESSING, AND SYSTEMS, 2019, 463 : 2059 - 2066
[43] ITERATIVE LOCALIZATION REFINEMENT IN CONVOLUTIONAL NEURAL NETWORKS FOR IMPROVED OBJECT DETECTION
Cheng, Kai-Wen
Chen, Yie-Tarng
Fang, Wen-Hsien
2016 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2016, : 3643 - 3647
[44] Using Grayscale Images for Object Recognition with Convolutional-Recursive Neural Network
Hieu Minh Bui
Lech, Margaret
Cheng, Eva
Neville, Katrina
Burnett, Ian S.
2016 IEEE SIXTH INTERNATIONAL CONFERENCE ON COMMUNICATIONS AND ELECTRONICS (ICCE), 2016, : 321 - 325
[45] OBJECT BOUNDING BOX-CRITIC NETWORKS FOR OCCLUSION-ROBUST OBJECT DETECTION IN ROAD SCENE
Kim, Jung Uk
Kwon, Jungsu
Kim, Hak Gu
Lee, Haesung
Ro, Yong Man
2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 1313 - 1317
[46] Object Recognition Using Neural Networks for Robotics Precision Application
Celenta, Giampiero
Guida, Domenico
ADVANCES IN DESIGN, SIMULATION AND MANUFACTURING III: MANUFACTURING AND MATERIALS ENGINEERING, VOL 1, 2020, : 108 - 117
[47] Tomato leaf diseases recognition based on deep convolutional neural networks
Tian, Kai
Zeng, Jiefeng
Song, Tianci
Li, Zhuliu
Evans, Asenso
Li, Jiuhao
JOURNAL OF AGRICULTURAL ENGINEERING, 2023, 54 (01)
[48] Two-stage traffic sign detection and recognition based on SVM and convolutional neural networks
Hechri, Ahmed
Mtibaa, Abdellatif
IET IMAGE PROCESSING, 2020, 14 (05) : 939 - 946
[49] Recognition of radar active-jamming through convolutional neural networks
Wang, Yafeng
Sun, Boye
Wang, Ning
JOURNAL OF ENGINEERING-JOE, 2019, 2019 (21) : 7695 - 7697
[50] Domain adaptation for ear recognition using deep convolutional neural networks
Eyiokur, Fevziye Irem
Yaman, Dogucan
Ekenel, Hazim Kemal
IET BIOMETRICS, 2018, 7 (03) : 199 - 206
