VIENA2: A Driving Anticipation Dataset

Cited by: 13

Authors:

Aliakbarian, Mohammad Sadegh ^{[1,2,4]}

Saleh, Fatemeh Sadat ^{[1,4]}

Salzmann, Mathieu ^{[3]}

Fernando, Basura ^{[2]}

Petersson, Lars ^{[1,4]}

Andersson, Lars ^{[4]}

Affiliations:

[1] Australian Natl Univ, Canberra, ACT, Australia

[2] ACRV, Canberra, ACT, Australia

[3] Ecole Polytech Fed Lausanne, CVLab, Lausanne, Switzerland

[4] CSIRO, Data61, Canberra, ACT, Australia

Source:

COMPUTER VISION - ACCV 2018, PT I | 2019 / Volume 11361

Keywords:

DOI:

10.1007/978-3-030-20887-5_28

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Action anticipation is critical in scenarios where one needs to react before the action is finalized. This is, for instance, the case in automated driving, where a car needs to, e.g., avoid hitting pedestrians and respect traffic lights. While solutions have been proposed to tackle subsets of the driving anticipation tasks, by making use of diverse, task-specific sensors, there is no single dataset or framework that addresses them all in a consistent manner. In this paper, we therefore introduce a new, large-scale dataset, called VIENA2, covering 5 generic driving scenarios, with a total of 25 distinct action classes. It contains more than 15K full HD, 5 s long videos acquired in various driving conditions, weathers, daytimes and environments, complemented with a common and realistic set of sensor measurements. This amounts to more than 2.25M frames, each annotated with an action label, corresponding to 600 samples per action class. We discuss our data acquisition strategy and the statistics of our dataset, and benchmark state-of-the-art action anticipation techniques, including a new multi-modal LSTM architecture with an effective loss function for action anticipation in driving scenarios.
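The dataset figures quoted in the abstract can be cross-checked with a little arithmetic. The sketch below is only a consistency check, not part of the paper: it treats the "more than" figures as round numbers, and the function name `implied_stats` is hypothetical. The resulting frame rate is an inference from these numbers, not a value stated in the abstract.

```python
# Round figures taken from the abstract (which says "more than" for videos and frames).
NUM_CLASSES = 25          # distinct action classes
SAMPLES_PER_CLASS = 600   # samples per action class
TOTAL_FRAMES = 2_250_000  # "more than 2.25M frames"
CLIP_SECONDS = 5          # each video is 5 s long


def implied_stats(num_classes, samples_per_class, total_frames, clip_seconds):
    """Derive video count, frames per video, and implied frame rate (hypothetical helper)."""
    videos = num_classes * samples_per_class
    frames_per_video = total_frames / videos
    frames_per_second = frames_per_video / clip_seconds
    return videos, frames_per_video, frames_per_second


videos, frames_per_video, fps = implied_stats(
    NUM_CLASSES, SAMPLES_PER_CLASS, TOTAL_FRAMES, CLIP_SECONDS
)
print(videos, frames_per_video, fps)  # prints: 15000 150.0 30.0
```

Under these assumptions, 25 classes at 600 samples each give the 15K videos mentioned in the abstract, and 2.25M frames spread over 5 s clips would imply roughly 30 frames per second.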

Pages: 449-466

Page count: 18
