Modeling Visual Context Is Key to Augmenting Object Detection Datasets

Cited by: 152
Authors
Dvornik, Nikita [1]
Mairal, Julien [1]
Schmid, Cordelia [1]
Affiliations
[1] Univ Grenoble Alpes, INRIA, CNRS, Grenoble INP, Inst Engn, F-38000 Grenoble, France
Source
COMPUTER VISION - ECCV 2018, PT XII | 2018 / Vol. 11216
Keywords
Object detection; Data augmentation; Visual context;
DOI
10.1007/978-3-030-01258-8_23
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Data augmentation is well known to be important when training deep neural networks for visual recognition. By artificially increasing the number of training examples, it helps reduce overfitting and improves generalization. For object detection, classical data augmentation generates new images by applying basic geometric transformations and color changes to the original training images. In this work, we go one step further and leverage segmentation annotations to increase the number of object instances present in the training data. We show that, for this approach to succeed, appropriately modeling the visual context surrounding objects is crucial for placing them in the right environment; without such modeling, adding instances actually hurts performance. With our context model, we achieve significant improvements in mean average precision on the VOC'12 benchmark when few labeled examples are available.
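The idea of using segmentation annotations to add object instances can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `paste_instance` and its parameters are hypothetical, the images are assumed to be NumPy arrays of shape (height, width, 3), and the paste location is simply passed in, whereas in the paper that location would come from the context model. The sketch copies the masked object pixels into a scene and returns the bounding box of the new instance.

```python
import numpy as np


def paste_instance(scene, obj, mask, top, left):
    """Paste the pixels of `obj` selected by `mask` into a copy of `scene`.

    scene: (H, W, 3) array, the target image.
    obj:   (h, w, 3) array, a crop containing the object.
    mask:  (h, w) boolean array, True where the crop shows the object.
    top, left: where the crop's upper-left corner goes (assumed to be
               chosen by some context model, which is not shown here).

    Returns the augmented image and the new instance's bounding box as
    (x_min, y_min, x_max, y_max), with exclusive maximum coordinates.
    """
    h, w = mask.shape
    if top < 0 or left < 0 or top + h > scene.shape[0] or left + w > scene.shape[1]:
        raise ValueError("object does not fit inside the scene at this location")
    if not mask.any():
        raise ValueError("segmentation mask is empty")

    out = scene.copy()
    # Basic slicing returns a view, so writing to `region` modifies `out`.
    region = out[top:top + h, left:left + w]
    region[mask] = obj[mask]

    # Tight bounding box around the pasted pixels, in scene coordinates.
    ys, xs = np.nonzero(mask)
    box = (int(left + xs.min()), int(top + ys.min()),
           int(left + xs.max()) + 1, int(top + ys.max()) + 1)
    return out, box
```

A real pipeline would typically also blend the object boundary and update the detection annotations with the returned box; both steps are left out of this sketch.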
Pages: 375-391
Number of pages: 17