iLab-20M: A large-scale controlled object dataset to investigate deep learning

Cited by: 28

Authors:

Borji, Ali ^{[1]}

Izadi, Saeed ^{[2]}

Itti, Laurent ^{[3]}

Affiliations:

[1] Univ Cent Florida, Ctr Comp Vis Res, Orlando, FL 32816 USA

[2] Amirkabir Univ Technol, Tehran, Iran

[3] Univ Southern Calif, Los Angeles, CA USA

Source:

2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2016

Funding:

National Science Foundation (USA);

Keywords:

HIERARCHICAL-MODELS; RECOGNITION;

DOI:

10.1109/CVPR.2016.244

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Tolerance to image variations (e.g., translation, scale, pose, illumination, background) is an important desired property of any object recognition system, be it human or machine. Moving towards increasingly bigger datasets has been trending in computer vision, especially with the emergence of highly popular deep learning models. While being very useful for learning invariance to object inter- and intra-class shape variability, these large-scale wild datasets are not very useful for learning invariance to other parameters, urging researchers to resort to other tricks for training models. In this work, we introduce a large-scale synthetic dataset, which is freely and publicly available, and use it to answer several fundamental questions regarding selectivity and invariance properties of convolutional neural networks. Our dataset contains two parts: a) objects shot on a turntable: 15 categories, 8 rotation angles, 11 cameras on a semi-circular arch, 5 lighting conditions, 3 focus levels, variety of backgrounds (23.4 per instance) generating 1320 images per instance (about 22 million images in total), and b) scenes: in which a robotic arm takes pictures of objects on a 1:160 scale scene. We study: 1) invariance and selectivity of different CNN layers, 2) knowledge transfer from one object category to another, 3) systematic or random sampling of images to build a train set, 4) domain adaptation from synthetic to natural scenes, and 5) order of knowledge delivery to CNNs. We also discuss how our analyses can lead the field to develop more efficient deep learning methods.
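
The figure of 1320 images per instance quoted in the abstract appears to be the product of the four listed capture parameters. The following is a minimal arithmetic sketch, assuming every combination of rotation angle, camera, lighting condition, and focus level is captured once; the variable names are illustrative and not taken from the paper.

```python
# Capture parameters as listed in the abstract (turntable part of the dataset).
rotation_angles = 8
cameras = 11
lighting_conditions = 5
focus_levels = 3

# Assumption: a full factorial capture, i.e. one image per parameter combination.
images_per_instance = (
    rotation_angles * cameras * lighting_conditions * focus_levels
)

print(images_per_instance)  # 1320
```

The total of about 22 million images also depends on the number of backgrounds and object instances, which the abstract does not fully specify, so it is not reproduced here.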

Citation

Pages: 2221-2230

Page count: 10

