Occlusion-robust workflow recognition with context-aware compositional ConvNet

Cited by: 1
Authors
Zhang, Min [1, 2]
Hu, Haiyang [2 ]
Li, Zhongjin [2 ]
Chen, Jie [2 ]
Affiliations
[1] Zhejiang Ind Polytech Coll, Dept Design & Art, Shaoxing, Peoples R China
[2] Hangzhou Dianzi Univ, Sch Comp Sci & Technol, Hangzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Workflow recognition; Occlusion detection; Deep learning; Bounding box voting;
DOI
10.1007/s00500-023-09225-2
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Workflow recognition based on deep convolutional neural networks has achieved promising performance. Although results on standard industrial workflows are impressive, performance on heavily occluded workflows remains far from satisfactory. In this paper, we present a context-aware compositional ConvNet (CA-CompNet) for occluded workflow detection, with the following contributions. First, we combine a compositional model with the original ConvNet to build a unified deep architecture for occluded workflow detection, which has shown innate robustness in object classification under occlusion. Second, to handle variable occlusion, bounding box annotations are used to segment the context from the target workflow instance during training. These segmentations are then used to train the proposed CA-CompNet, which enables the network to disentangle the feature representation of the workflow instance from that of its context. Third, a robust voting mechanism over candidate bounding boxes is introduced to improve detection accuracy, helping the model precisely localize the bounding box of a specific workflow instance. Comprehensive experiments demonstrate that the proposed context-aware network robustly detects workflow instances under occlusion in industrial environments, improving detection performance on the MS COCO dataset by 4.6 percentage points in absolute terms (from 45.1% to 49.7%) compared with the advanced CenterNet.
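The abstract's second contribution, using a bounding box annotation to separate the target instance from its context, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `split_object_and_context` and `context_fraction`, the `(x_min, y_min, x_max, y_max)` box convention with exclusive maxima, and the toy image size are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box convention: (x_min, y_min, x_max, y_max), maxima exclusive.
Box = Tuple[int, int, int, int]


def split_object_and_context(height: int, width: int, box: Box) -> List[List[bool]]:
    """Return a height x width mask: True inside the box (object), False outside (context)."""
    x0, y0, x1, y1 = box
    # Clamp the box to the image bounds so out-of-range annotations are tolerated.
    x0, x1 = max(0, x0), min(width, x1)
    y0, y1 = max(0, y0), min(height, y1)
    return [
        [(y0 <= y < y1) and (x0 <= x < x1) for x in range(width)]
        for y in range(height)
    ]


def context_fraction(mask: List[List[bool]]) -> float:
    """Fraction of pixels labeled as context (assumes a non-empty mask)."""
    total = sum(len(row) for row in mask)
    object_pixels = sum(sum(row) for row in mask)
    return (total - object_pixels) / total


# Toy example: a 4 x 6 image with a 3 x 2 annotated box.
mask = split_object_and_context(4, 6, (1, 1, 4, 3))
```

In a training pipeline, a mask of this kind could be used to route features inside the box and features outside it to separate representations; how CA-CompNet does this in detail is not stated in the abstract.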
Pages: 5125-5135
Number of pages: 11