Distribution-balanced augmentation for rough data driven object detection

Cited by: 0

Authors:

Wang Z. [1]

Tian L. [1]

Du Q. [1]

Sun Z. [1]

An Y. [1]

Liao W. [2]

Affiliations:

[1] School of Automation Science and Engineering, South China University of Technology, Wushan Street, Guangdong, Guangzhou

[2] Department of Telecommunications and Information Processing, Ghent University, St-Pietersnieuwstraat 41, Ghent

Source:

Multimedia Tools and Applications | 2024, Vol. 83, No. 18

Keywords:

Data augmentation; Data imbalance; Instance segmentation; Object detection

DOI:

10.1007/s11042-023-16589-y

Abstract:

Recent advancements in deep learning have highlighted the pivotal role of data in training deep neural networks. However, the persistent issue of imbalanced data distribution poses a challenge for achieving optimal performance. Existing approaches like re-sampling, re-weighting, decoupling representation and data augmentation have sought to address data imbalance by increasing data quantity and diversity, improving model performance and generalization. Notably, Copy-Paste data augmentation has shown promise but comes with significant time and computing requirements. To overcome these limitations, we propose an innovative solution called balanced object paste (BOP). BOP enhances data distribution by pasting additional objects onto target images using a set of well-defined principles, synthesizing new images and yielding promising results. The position for pasting is generated through a region generation method based on existing object distribution. Additional objects are sampled in a class-balanced manner and scaled to achieve balance across categories and sizes. BOP outperforms existing methods like Simple Copy-Paste, showcasing notable improvements of 4.1 in AP for PASCAL VOC and a speed boost of 13 times. Moreover, BOP consistently enhances detector performance across various datasets and conditions, and its versatility extends to box-labeled datasets, establishing it as a valuable tool for object detection. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023.
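
The abstract states that additional objects are "sampled in a class-balanced manner" but does not give the sampling rule. As a rough illustration only, the sketch below uses plain inverse-frequency weighting, a common way to give every class the same total sampling probability; it is an assumption, not the authors' BOP implementation, and the names `class_balanced_weights` and `sample_balanced` are hypothetical.

```python
import random
from collections import Counter


def class_balanced_weights(labels):
    """Return one weight per object so that every class has equal total weight.

    Assumption: inverse-frequency weighting. The paper may use a different rule.
    """
    counts = Counter(labels)
    num_classes = len(counts)
    # Each class receives total mass 1 / num_classes, split evenly over its objects.
    return [1.0 / (num_classes * counts[label]) for label in labels]


def sample_balanced(labels, k, seed=0):
    """Draw k object indices (with replacement) using class-balanced weights."""
    rng = random.Random(seed)
    weights = class_balanced_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=k)


if __name__ == "__main__":
    # Toy object bank: 8 "person" instances and 2 "dog" instances.
    labels = ["person"] * 8 + ["dog"] * 2
    picks = sample_balanced(labels, k=6)
    print([labels[i] for i in picks])
```

With these weights, a rare class such as "dog" in the toy bank is drawn about as often as a frequent one, which is the effect the abstract describes for balancing categories. Balancing across object sizes and choosing paste regions would need additional steps that this sketch does not cover.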

Pages: 56103-56125

Page count: 22

References: 44 in total
