Compositional Convolutional Neural Networks: A Deep Architecture with Innate Robustness to Partial Occlusion

Cited by: 66
Authors
Kortylewski, Adam [1]
He, Ju [1]
Liu, Qing [1]
Yuille, Alan [1]
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020
Funding
Swiss National Science Foundation;
Keywords
SHAPE;
DOI
10.1109/CVPR42600.2020.00896
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recent findings show that deep convolutional neural networks (DCNNs) do not generalize well under partial occlusion. Inspired by the success of compositional models at classifying partially occluded objects, we propose to integrate compositional models and DCNNs into a unified deep model with innate robustness to partial occlusion. We term this architecture Compositional Convolutional Neural Network. In particular, we propose to replace the fully connected classification head of a DCNN with a differentiable compositional model. The generative nature of the compositional model enables it to localize occluders and subsequently focus on the non-occluded parts of the object. We conduct classification experiments on artificially occluded images as well as real images of partially occluded objects from the MS-COCO dataset. The results show that DCNNs do not classify occluded objects robustly, even when trained with data that is strongly augmented with partial occlusions. Our proposed model outperforms standard DCNNs by a large margin at classifying partially occluded objects, even when it has not been exposed to occluded objects during training. Additional experiments demonstrate that CompositionalNets can also localize the occluders accurately, despite being trained with class labels only. The code used in this work is publicly available (1).
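The occlusion-handling idea in the abstract (localize occluders, then rely on the non-occluded parts) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `classify_with_occlusion`, the inputs `obj_loglik` and `occ_loglik`, and the rule of letting each feature-map position be explained by whichever of the object model or a generic occluder model fits better are all assumptions made for illustration.

```python
import numpy as np

def classify_with_occlusion(obj_loglik, occ_loglik):
    """Toy occlusion-aware scoring (illustrative assumption, not the paper's code).

    obj_loglik: array (C, H, W), hypothetical per-position log-likelihood of the
                features under each of C class models.
    occ_loglik: array (H, W), hypothetical per-position log-likelihood under a
                generic occluder model.
    Returns the predicted class index, the per-class scores, and a boolean map
    of positions attributed to the occluder for the predicted class.
    """
    # Each position is explained by the better-fitting of the two models,
    # so poorly matching (likely occluded) positions cannot dominate the score.
    per_position = np.maximum(obj_loglik, occ_loglik[None, :, :])
    scores = per_position.sum(axis=(1, 2))
    pred = int(np.argmax(scores))
    # Positions where the occluder model wins are flagged as occluded.
    occluded = occ_loglik > obj_loglik[pred]
    return pred, scores, occluded

# Toy input: class 0 fits well everywhere except one position.
obj = np.full((2, 2, 2), -5.0)
obj[0] = -1.0
obj[0, 0, 0] = -9.0
occ = np.full((2, 2), -3.0)

pred, scores, occluded = classify_with_occlusion(obj, occ)
print(pred, scores, occluded)
```

In this toy input the single badly matching position is assigned to the occluder model, so it costs class 0 only the occluder's log-likelihood instead of the much lower object-model value, and the same comparison yields the occlusion map without any occlusion labels.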
Pages: 8937-8946
Page count: 10
Related papers
39 items in total
[1] Banerjee A, 2005, J MACH LEARN RES, V6, P1345
[2] Bienenstock E, 1997, ADV NEUR IN, V9, P838
[3] Bienenstock E, 1998, HDB BRAIN THEORY NEU, P223
[4] Unsupervised Learning of Dictionaries of Hierarchical Compositional Models [J]. Dai, Jifeng; Hong, Yi; Hu, Wenze; Zhu, Song-Chun; Wu, Ying Nian. 2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014: 2505-2512
[5] Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
[6] DeVries T., 2017, IMPROVED REGULARIZAT
[7] Fawzi Alhussein, 2016, MEASURING EFFECT NUI
[8] Fidler S, 2014, ARXIV
[9] CONNECTIONISM AND COGNITIVE ARCHITECTURE - A CRITICAL ANALYSIS [J]. FODOR, JA; PYLYSHYN, ZW. COGNITION, 1988, 28 (1-2): 3-71
[10] A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs [J]. George, Dileep; Lehrach, Wolfgang; Kansky, Ken; Lazaro-Gredilla, Miguel; Laan, Christopher; Marthi, Bhaskara; Lou, Xinghua; Meng, Zhaoshi; Liu, Yi; Wang, Huayan; Lavin, Alex; Phoenix, D. Scott. SCIENCE, 2017, 358 (6368)