BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning

Cited by: 1396

Authors:

Yu, Fisher [1]

Chen, Haofeng [1]

Wang, Xin [1]

Xian, Wenqi [1,2]

Chen, Yingying [1]

Liu, Fangchen [1,3]

Madhavan, Vashisht [1,4]

Darrell, Trevor [1]

Affiliations:

[1] Univ Calif Berkeley, Berkeley, CA 94720 USA

[2] Cornell Univ, Ithaca, NY 14853 USA

[3] Univ Calif San Diego, San Diego, CA USA

[4] Element Inc, New York, NY USA

Source:

2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2020

Keywords:

DOI:

10.1109/CVPR42600.2020.00271

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Datasets drive vision progress, yet existing driving datasets are impoverished in terms of visual content and supported tasks to study multitask learning for autonomous driving. Researchers are usually constrained to study a small set of problems on one dataset, while real-world computer vision applications require performing tasks of various complexities. We construct BDD100K, the largest driving video dataset with 100K videos and 10 tasks to evaluate the exciting progress of image recognition algorithms on autonomous driving. The dataset possesses geographic, environmental, and weather diversity, which is useful for training models that are less likely to be surprised by new conditions. Based on this diverse dataset, we build a benchmark for heterogeneous multitask learning and study how to solve the tasks together. Our experiments show that special training strategies are needed for existing models to perform such heterogeneous tasks. BDD100K opens the door for future studies in this important venue.


Pages: 2633-2642

Page count: 10

