In-and-Out: a data augmentation technique for computer vision tasks

Cited by: 2
Authors
Li, Chenghao [1]
Zhang, Jing [1,2]
Hu, Li [1]
Zhao, Hao [1,2]
Zhu, Huilong [1]
Shan, Maomao [1]
Affiliations
[1] Southwest Univ Sci & Technol, Sch Informat Engn, Mianyang, Sichuan, Peoples R China
[2] Univ Sci & Technol China, Sch Informat Sci & Technol, Hefei, Peoples R China
Keywords
data augmentation; information variance; dynamic local operation; information removal; robustness
DOI
10.1117/1.JEI.31.1.013023
Chinese Library Classification (CLC)
TM [Electrical technology]; TN [Electronics and communication technology]
Discipline codes
0808; 0809
Abstract
This study addresses the over-fitting problem in training deep convolutional neural network models and their poor robustness when applied in occlusion environments. We propose a data augmentation method, In-and-Out. First, information variance is enhanced through a dynamic local operation while the overall geometric structure of the training image is preserved; compared with global data augmentation methods, our method effectively alleviates over-fitting during model training and significantly improves the model's generalization ability. Then, through a dynamic information-removal operation, regions of the image are hidden according to dynamic patches generated from multiple parameters. Compared with other information-removal methods, our method better simulates real-world occlusion, improving the model's robustness in various occlusion scenes. The method is simple to implement and can be integrated with most CNN-based computer vision tasks. Extensive experiments show that our method surpasses previous methods on the Canadian Institute for Advanced Research (CIFAR) dataset for image classification, the PASCAL Visual Object Classes dataset for object detection, and the Cityscapes dataset for semantic segmentation. In addition, our robustness experiments show that the method is robust to occlusion in various scenes. (C) 2022 SPIE and IS&T
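The information-removal step described in the abstract (hiding image regions according to dynamically generated patches drawn from multiple random parameters) can be sketched as follows. This is a minimal illustration of the general patch-occlusion idea, not the authors' exact In-and-Out algorithm; the function name, patch-count and size ranges, and fill value are all illustrative assumptions.

```python
import numpy as np

def dynamic_information_removal(image, max_patches=4, min_frac=0.05,
                                max_frac=0.2, fill=0, rng=None):
    """Hypothetical sketch: hide several randomly sized, randomly placed
    rectangular patches to simulate occlusion. Patch count, sizes, and
    positions are redrawn on every call, loosely following the abstract's
    notion of dynamic patches generated from multiple parameters."""
    rng = np.random.default_rng() if rng is None else rng
    out = image.copy()
    h, w = out.shape[:2]
    n_patches = int(rng.integers(1, max_patches + 1))
    for _ in range(n_patches):
        # Patch extent as a random fraction of image height/width.
        ph = max(1, int(h * rng.uniform(min_frac, max_frac)))
        pw = max(1, int(w * rng.uniform(min_frac, max_frac)))
        top = int(rng.integers(0, h - ph + 1))
        left = int(rng.integers(0, w - pw + 1))
        out[top:top + ph, left:left + pw] = fill  # occlude the patch
    return out

# Usage: occlude a toy 32x32 RGB image.
img = np.full((32, 32, 3), 255, dtype=np.uint8)
aug = dynamic_information_removal(img, rng=np.random.default_rng(0))
```

Such a transform would typically be applied per training sample alongside standard augmentations; because the original array is copied, the input image is left untouched.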
Pages: 19