InstaBoost++: Visual Coherence Principles for Unified 2D/3D Instance Level Data Augmentation

Cited by: 2

Authors:

Sun, Jianhua [1]

Fang, Hao-Shu [1]

Li, Yuxuan [1]

Wang, Runzhong [1]

Gou, Minghao [1]

Lu, Cewu [1]

Affiliations:

[1] Shanghai Jiao Tong University, Department of Computer Science and Engineering, Dongchuan Road, Shanghai 201100, People's Republic of China

Source:

International Journal of Computer Vision | 2023 / Vol. 131 / Issue 10

Keywords:

Data augmentation; Visual coherence; Object detection; Instance segmentation; 3D detection; OBJECT; SEARCH;

DOI:

10.1007/s11263-023-01807-9

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Instance-level perception tasks like object detection, instance segmentation, and 3D detection require many training samples to achieve satisfactory performance. The meticulous labels for these tasks are usually expensive to obtain and data augmentation is a natural choice to tackle such a problem. However, instance-level augmentation is less studied in previous research. In this paper, we present an effective, efficient and unified crop-paste mechanism to augment the training set utilizing existing instance-level annotations. Our design is derived from visual coherence and mines three inherent principles that widely exist in real-world data: (i) background coherence in local neighbor area, (ii) appearance coherence for instance placement, and (iii) instance coherence within the same category. Such methodologies are unified for various tasks including object detection, instance segmentation, and 3D detection. Extensive experiments demonstrate that our proposed approaches can successfully boost the performance of diverse frameworks on various datasets across multiple tasks, without modifying the network structure. Remarkable improvements are obtained: 5.1 mAP for object detection and 3.2 mAP for instance segmentation on COCO dataset, and 6.9 mAP for 3D detection on ScanNetV2 dataset. Our method can be easily integrated into different frameworks without affecting the training and inference efficiency.
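
To make the crop-paste idea in the abstract concrete, here is a minimal sketch of pasting an annotated instance at a small random offset. It is not the authors' implementation: the function name `crop_paste`, the `max_shift` and `fill` parameters, and the flat fill used for the vacated pixels are all assumptions for illustration. Limiting the shift to a small neighborhood is only a crude stand-in for the background coherence principle, and the appearance and instance coherence principles are not modeled.

```python
import random


def crop_paste(image, mask, max_shift=2, fill=0, rng=None):
    """Move one annotated instance by a small random offset (hypothetical helper).

    image: 2D list of pixel values.
    mask:  2D list of 0/1 flags with the same shape, marking the instance.
    Returns (new_image, new_mask); the inputs are not modified.
    """
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    coords = [(y, x) for y in range(h) for x in range(w) if mask[y][x]]
    new_image = [row[:] for row in image]
    if not coords:
        # Nothing to move: return plain copies.
        return new_image, [row[:] for row in mask]

    ys = [y for y, _ in coords]
    xs = [x for _, x in coords]
    # Clamp the offset range so the whole instance stays inside the image.
    dy = rng.randint(max(-max_shift, -min(ys)), min(max_shift, h - 1 - max(ys)))
    dx = rng.randint(max(-max_shift, -min(xs)), min(max_shift, w - 1 - max(xs)))

    # Assumption: vacated pixels get a flat fill value; a real pipeline
    # would restore the background more carefully (for example by inpainting).
    for y, x in coords:
        new_image[y][x] = fill

    # Paste the instance pixels at the shifted location and update the label.
    new_mask = [[0] * w for _ in range(h)]
    for y, x in coords:
        new_image[y + dy][x + dx] = image[y][x]
        new_mask[y + dy][x + dx] = 1
    return new_image, new_mask
```

Because the mask is moved together with the pixels, the augmented sample keeps a consistent instance-level annotation, which is what lets this kind of augmentation reuse existing labels.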


Pages: 2665-2681

Page count: 17

