Action anticipation for collaborative environments: The impact of contextual information and uncertainty-based prediction

Cited by: 5

Authors:

Canuto, Clebeson [1]

Moreno, Plinio [2]

Samatelo, Jorge [1]

Vassallo, Raquel [1]

Santos-Victor, Jose [2]

Affiliations:

[1] Univ Fed Espirito Santo, Dept Elect Engn, Room 20,CT 2,Av Fernando Ferrari 514, BR-29075910 Vitoria, ES, Brazil

[2] Univ Lisbon, Inst Syst & Robot, Inst Super Tecn, Floor 7,North Tower,Av Rovisco Pais 1, P-1049001 Lisbon, Portugal

Source:

NEUROCOMPUTING | 2021 / Volume 444

Keywords:

Action anticipation; Early action prediction; Context information; Bayesian deep learning; Uncertainty; Action recognition

DOI:

10.1016/j.neucom.2020.07.135

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

To interact with humans in collaborative environments, machines need to be able to predict (i.e., anticipate) future events, and execute actions in a timely manner. However, the observation of the human limb movements may not be sufficient to anticipate their actions unambiguously. In this work, we consider two additional sources of information (i.e., context) over time, gaze, movement and object information, and study how these additional contextual cues improve the action anticipation performance. We address action anticipation as a classification task, where the model takes the available information as the input and predicts the most likely action. We propose to use the uncertainty about each prediction as an online decision-making criterion for action anticipation. Uncertainty is modeled as a stochastic process applied to a time-based neural network architecture, which improves the conventional class-likelihood (i.e., deterministic) criterion. The main contributions of this paper are fourfold: (i) We propose a novel and effective decision-making criterion that can be used to anticipate actions even in situations of high ambiguity; (ii) we propose a deep architecture that outperforms previous results in the action anticipation task when using the Acticipate collaborative dataset; (iii) we show that contextual information is important to disambiguate the interpretation of similar actions; and (iv) we also provide a formal description of three existing performance metrics that can be easily used to evaluate action anticipation models. Our results on the Acticipate dataset showed the importance of contextual information and the uncertainty criterion for action anticipation. We achieve an average accuracy of 98.75% in the anticipation task using only an average of 25% of observations. Also, considering that a good anticipation model should perform well in the action recognition task, we achieve an average accuracy of 100% in action recognition on the Acticipate dataset, when the entire observation set is used. (C) 2020 Elsevier B.V. All rights reserved.

Pages: 301-318

Number of pages: 18
