Compositional Convolutional Neural Networks: A Robust and Interpretable Model for Object Recognition Under Occlusion

Cited by: 51
Authors
Kortylewski, Adam [1]
Liu, Qing [1]
Wang, Angtian [1]
Sun, Yihong [1]
Yuille, Alan [1]
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
Funding
Swiss National Science Foundation
Keywords
Compositional models; Robustness to partial occlusion; Image classification; Object detection; Out-of-distribution generalization
DOI
10.1007/s11263-020-01401-3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Computer vision systems in real-world applications need to be robust to partial occlusion while also being explainable. In this work, we show that black-box deep convolutional neural networks (DCNNs) have only limited robustness to partial occlusion. We overcome these limitations by unifying DCNNs with part-based models into Compositional Convolutional Neural Networks (CompositionalNets), an interpretable deep architecture with innate robustness to partial occlusion. Specifically, we propose to replace the fully connected classification head of DCNNs with a differentiable compositional model that can be trained end-to-end. The structure of the compositional model enables CompositionalNets to decompose images into objects and context, and to further decompose object representations in terms of individual parts and the object's pose. The generative nature of our compositional model enables it to localize occluders and to recognize objects based on their non-occluded parts. We conduct extensive experiments on image classification and object detection, using images of artificially occluded objects from the PASCAL3D+ and ImageNet datasets and real images of partially occluded vehicles from the MS-COCO dataset. Our experiments show that CompositionalNets made from several popular DCNN backbones (VGG-16, ResNet50, ResNeXt) improve by a large margin over their non-compositional counterparts at classifying and detecting partially occluded objects. Furthermore, they can localize occluders accurately despite being trained with class-level supervision only. Finally, we demonstrate that CompositionalNets provide human-interpretable predictions, as their individual components can be understood as detecting parts and estimating an object's viewpoint.
Pages: 736-760
Number of pages: 25