In-and-Out: a data augmentation technique for computer vision tasks

Cited by: 2
Authors
Li, Chenghao [1]
Zhang, Jing [1,2]
Hu, Li [1]
Zhao, Hao [1,2]
Zhu, Huilong [1]
Shan, Maomao [1]
Affiliations
[1] Southwest Univ Sci & Technol, Sch Informat Engn, Mianyang, Sichuan, Peoples R China
[2] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei, Peoples R China
Keywords
data augmentation; information variance; dynamic local operation; information removal; robustness
DOI
10.1117/1.JEI.31.1.013023
Chinese Library Classification (CLC)
TM [Electrical technology]; TN [Electronics and communication technology]
Discipline codes
0808; 0809
Abstract
This study addresses the over-fitting problem in training deep convolutional neural network models and their poor robustness when applied in occlusion environments. We propose a data augmentation method, In-and-Out. First, information variance is enhanced through a dynamic local operation while the overall geometric structure of the training image is preserved; compared with global data augmentation methods, our method effectively alleviates over-fitting during model training and significantly improves the model's generalization ability. Then, through a dynamic information-removal operation, regions of the image are hidden according to dynamic patches generated from multiple parameters. Compared with other information-removal methods, our method better simulates real-world occlusion, improving the model's robustness in various occlusion scenes. The method is simple to implement and can be integrated with most CNN-based computer vision tasks. Extensive experiments show that our method surpasses previous methods on the Canadian Institute for Advanced Research (CIFAR) dataset for image classification, the PASCAL Visual Object Classes dataset for object detection, and the Cityscapes dataset for semantic segmentation. In addition, our robustness experiments show that the method is robust to occlusion in various scenes. (C) 2022 SPIE and IS&T
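The information-removal step described in the abstract (hiding image regions according to dynamically generated patches drawn from multiple random parameters) can be sketched as follows. This is a minimal illustration of the general patch-occlusion idea, not the authors' exact In-and-Out algorithm; the function name, patch-count and size ranges, and fill value are all illustrative assumptions.

```python
import numpy as np

def dynamic_information_removal(image, max_patches=4, min_frac=0.05,
                                max_frac=0.2, fill=0, rng=None):
    """Hypothetical sketch: hide several randomly sized, randomly placed
    rectangular patches to simulate occlusion. Patch count, sizes, and
    positions are redrawn on every call, loosely following the abstract's
    notion of dynamic patches generated from multiple parameters."""
    rng = np.random.default_rng() if rng is None else rng
    out = image.copy()
    h, w = out.shape[:2]
    n_patches = int(rng.integers(1, max_patches + 1))
    for _ in range(n_patches):
        # Patch extent as a random fraction of image height/width.
        ph = max(1, int(h * rng.uniform(min_frac, max_frac)))
        pw = max(1, int(w * rng.uniform(min_frac, max_frac)))
        top = int(rng.integers(0, h - ph + 1))
        left = int(rng.integers(0, w - pw + 1))
        out[top:top + ph, left:left + pw] = fill  # occlude the patch
    return out

# Usage: occlude a toy 32x32 RGB image.
img = np.full((32, 32, 3), 255, dtype=np.uint8)
aug = dynamic_information_removal(img, rng=np.random.default_rng(0))
```

Such a transform would typically be applied per training sample alongside standard augmentations; because the original array is copied, the input image is left untouched.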
Pages: 19