Exploring the Sim2Real Gap using Digital Twins

Cited by: 1

Authors:

Sudhakar, Sruthi [1]

Hanzelka, Jon [2]

Bobillot, Josh [2]

Randhavane, Tanmay [2]

Joshi, Neel [2]

Vineet, Vibhav [2]

Affiliations:

[1] Columbia Univ, New York, NY 10027 USA

[2] Microsoft Res, Redmond, WA USA

Source:

2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023) | 2023

DOI:

10.1109/ICCV51070.2023.01867

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Creating datasets for training computer vision models is very time-consuming. An emerging alternative is to use synthetic data, but if the synthetic data is not similar enough to the real data, performance is typically below that of training with real data. Thus, using synthetic data still requires a large amount of time, money, and skill, as one needs to author the data carefully. In this paper, we seek to understand which aspects of this authoring process are most critical. We present an analysis of which factors of variation between simulated and real data are most important. We capture images of YCB objects to create a novel YCB-Real dataset. We then create a novel synthetic "digital twin" dataset, YCB-Synthetic, which matches the YCB-Real dataset and includes a variety of artifacts added to the synthetic data. We study the effects of these artifacts on our dataset and two existing published datasets on two different computer vision tasks: object detection and instance segmentation. We provide an analysis of the cost-benefit trade-offs between artist time for fixing artifacts and trained model accuracy. We plan to release this dataset (images and 3D assets) so that it can be used further by the community. Link to dataset(1)

Pages: 20361-20370

Page count: 10
