InstaBoost++: Visual Coherence Principles for Unified 2D/3D Instance Level Data Augmentation

Cited by: 2
Authors
Sun, Jianhua [1]
Fang, Hao-Shu [1]
Li, Yuxuan [1]
Wang, Runzhong [1]
Gou, Minghao [1]
Lu, Cewu [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Dongchuan Rd, Shanghai 201100, Peoples R China
Keywords
Data augmentation; Visual coherence; Object detection; Instance segmentation; 3D detection; OBJECT; SEARCH
DOI
10.1007/s11263-023-01807-9
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Instance-level perception tasks such as object detection, instance segmentation, and 3D detection require many training samples to achieve satisfactory performance. The meticulous labels these tasks need are usually expensive to obtain, and data augmentation is a natural way to tackle this problem. However, instance-level augmentation has been less studied in previous research. In this paper, we present an effective, efficient, and unified crop-paste mechanism that augments the training set using existing instance-level annotations. Our design is derived from visual coherence and mines three inherent principles that widely exist in real-world data: (i) background coherence in the local neighboring area, (ii) appearance coherence for instance placement, and (iii) instance coherence within the same category. These methodologies are unified across tasks including object detection, instance segmentation, and 3D detection. Extensive experiments demonstrate that the proposed approaches boost the performance of diverse frameworks on various datasets across multiple tasks, without modifying the network structure. We obtain improvements of 5.1 mAP for object detection and 3.2 mAP for instance segmentation on the COCO dataset, and 6.9 mAP for 3D detection on the ScanNetV2 dataset. Our method can be easily integrated into different frameworks without affecting training or inference efficiency.
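To make the crop-paste idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a grayscale image stored as nested lists and a boolean instance mask, and it moves the instance by a small offset so that it stays within its local neighborhood, loosely in the spirit of principle (i). The function names `shift_instance` and `jitter_instance`, the `max_shift` parameter, and the constant `background` fill (a stand-in for proper inpainting of the vacated region) are all assumptions made for illustration.

```python
import random


def shift_instance(image, mask, dx, dy, background=0):
    """Crop the masked instance and paste it shifted by (dx, dy).

    Returns (new_image, new_mask); the inputs are left unchanged.
    The vacated pixels are filled with a constant `background` value,
    which is a simplification standing in for real inpainting.
    """
    h, w = len(image), len(image[0])
    # Copy the image with the original instance erased.
    new_image = [
        [background if mask[y][x] else image[y][x] for x in range(w)]
        for y in range(h)
    ]
    new_mask = [[False] * w for _ in range(h)]
    # Paste every instance pixel at its shifted location, clipping at borders.
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    new_image[ny][nx] = image[y][x]
                    new_mask[ny][nx] = True
    return new_image, new_mask


def jitter_instance(image, mask, max_shift=1, rng=random):
    """Apply a small random shift so the instance stays near its origin."""
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    return shift_instance(image, mask, dx, dy)
```

Because the mask moves together with the pixels, the shifted mask can be reused directly as the annotation for the augmented sample, which is why this style of augmentation needs no additional labeling.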
Pages: 2665-2681
Page count: 17