MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps

Cited by: 46

Authors:

Wu, Pengxiang [1]

Chen, Siheng [2]

Metaxas, Dimitris N. [1]

Affiliations:

[1] Rutgers State Univ, Newark, NJ 07102 USA

[2] Mitsubishi Elect Res Labs, Cambridge, MA USA

Source:

2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2020) | 2020

Keywords:

OCCUPANCY GRID PREDICTION;

DOI:

10.1109/CVPR42600.2020.01140

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The ability to reliably perceive environmental states, particularly the existence of objects and their motion behavior, is crucial for autonomous driving. In this work, we propose an efficient deep model, called MotionNet, to jointly perform perception and motion prediction from 3D point clouds. MotionNet takes a sequence of LiDAR sweeps as input and outputs a bird's eye view (BEV) map, which encodes the object category and motion information in each grid cell. The backbone of MotionNet is a novel spatiotemporal pyramid network, which extracts deep spatial and temporal features in a hierarchical fashion. To enforce the smoothness of predictions over both space and time, the training of MotionNet is further regularized with novel spatial and temporal consistency losses. Extensive experiments show that the proposed method overall outperforms the state of the art, including the latest scene-flow- and 3D-object-detection-based methods. This indicates the potential value of the proposed method as a backup to the bounding-box-based system and as a source of complementary information for the motion planner in autonomous driving. Code is available at https://www.merl.com/research/license#MotionNet.
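As a rough illustration of the BEV grid representation mentioned in the abstract, the following is a minimal sketch of how one LiDAR sweep could be rasterized into a binary occupancy grid, and how several sweeps could be stacked into a sequence. It is not the authors' implementation: the function names, the spatial ranges, and the cell size are hypothetical, the height dimension is simply collapsed, and the sweeps are assumed to be already expressed in a common ego-vehicle frame.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres, ego-vehicle frame
Grid = List[List[int]]


def points_to_bev(
    points: Sequence[Point],
    x_range: Tuple[float, float] = (-32.0, 32.0),  # hypothetical extent
    y_range: Tuple[float, float] = (-32.0, 32.0),  # hypothetical extent
    cell_size: float = 0.25,                       # hypothetical resolution
) -> Grid:
    """Rasterize one sweep into a binary BEV occupancy grid (1 = occupied)."""
    n_rows = int(round((x_range[1] - x_range[0]) / cell_size))
    n_cols = int(round((y_range[1] - y_range[0]) / cell_size))
    grid = [[0] * n_cols for _ in range(n_rows)]
    for x, y, _z in points:
        # Skip points that fall outside the region covered by the grid.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        # Clamp to guard against floating-point rounding at the upper edge.
        row = min(int((x - x_range[0]) / cell_size), n_rows - 1)
        col = min(int((y - y_range[0]) / cell_size), n_cols - 1)
        grid[row][col] = 1
    return grid


def sweeps_to_bev_sequence(sweeps: Sequence[Sequence[Point]], **kwargs) -> List[Grid]:
    """Rasterize a temporal sequence of sweeps, oldest first."""
    return [points_to_bev(sweep, **kwargs) for sweep in sweeps]
```

A model of the kind described would consume such a sequence of grids and predict, for each cell, a category and a motion estimate; those prediction heads are outside the scope of this sketch.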


Pages: 11382-11392

Page count: 11
