Large Scale Visual Recognition through Adaptation using Joint Representation and Multiple Instance Learning

Cited by: 0
Authors
Hoffman, Judy [1]
Pathak, Deepak [1]
Tzeng, Eric [1]
Long, Jonathan [1]
Guadarrama, Sergio [1, 3]
Darrell, Trevor [1]
Saenko, Kate [2]
Affiliations
[1] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
[2] Univ Massachusetts, Dept Comp Sci, Lowell, MA 01854 USA
[3] Google Res, Mountain View, CA USA
Keywords
Computer Vision; Deep Learning; Transfer Learning; Large Scale Learning
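DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812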
Abstract
A major barrier to scaling visual recognition systems is the difficulty of obtaining labeled images for large numbers of categories. Recently, deep convolutional neural networks (CNNs) trained using 1.2M+ labeled images have emerged as clear winners on object classification benchmarks. Unfortunately, only a small fraction of those labels are available with bounding box localization for training the detection task, and even fewer pixel-level annotations are available for semantic segmentation. It is much cheaper and easier to collect large quantities of image-level labels from search engines than it is to collect scene-centric images with precisely localized labels. We develop methods for learning large-scale recognition models that exploit joint training over both weak (image-level) and strong (bounding box) labels and that transfer learned perceptual representations from strongly labeled auxiliary tasks. We provide a novel formulation of a joint multiple instance learning method that includes examples from object-centric data with image-level labels when available, and that also performs domain transfer learning to improve the underlying detector representation. We then show how to use our large-scale detectors to produce pixel-level annotations. Using our method, we produce a detector for more than 7.6K categories and release code and models at lsda.berkeleyvision.org.
Pages: 31