Understanding human activity with uncertainty measure for novelty in graph convolutional networks

Cited: 0
Authors
Xing, Hao [1 ]
Burschka, Darius [1 ]
Affiliations
[1] Tech Univ Munich, Sch Computat Informat & Technol, Machine Vis & Percept Grp, Boltzmannstr 3, D-85748 Garching, Germany
Keywords
Uncertainty quantification; human activity recognition; activity segmentation; human-object interaction
DOI
10.1177/02783649241287800
CLC Classification
TP24 [Robotics]
Discipline Codes
080202; 1405
Abstract
Understanding human activity is a crucial aspect of developing intelligent robots, particularly for human-robot collaboration. Existing systems, however, suffer from over-segmentation caused by errors in the decoder's up-sampling process. In response, we introduce the Temporal Fusion Graph Convolutional Network, which corrects the inadequate boundary estimation of individual actions within an activity stream and mitigates over-segmentation along the temporal dimension. Moreover, systems that use human activity recognition for decision-making need more than action labels: they require a confidence value indicating how certain the correspondence is between an observation and the training examples. This is crucial to prevent overconfident responses to unforeseen scenarios that were absent from the training data and may be mismatched by weak similarity measures within the system. To address this, we propose a Spectral Normalized Residual connection that enables efficient estimation of novelty in observations. By constraining the maximum gradients of weight updates, it preserves input distances within the feature space, promoting more robust handling of novel situations and mitigating the risks of overconfidence. We then use a Gaussian process to quantify distance in this feature space. The final model is evaluated on two challenging public human-object interaction datasets, Bimanual Actions and IKEA Assembly, and outperforms popular existing methods in action recognition and segmentation accuracy as well as out-of-distribution detection.
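The distance-preservation idea behind the Spectral Normalized Residual connection can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the layer shape, the `tanh` nonlinearity, and the bound `c` are illustrative assumptions. The key property is that rescaling the residual branch's weight matrix so its spectral norm is at most `c < 1` makes the map `x -> x + f(x)` bi-Lipschitz, so distances between inputs are neither collapsed nor blown up in feature space — which is what lets a downstream distance-based measure (such as a Gaussian process) detect novel inputs.

```python
import numpy as np

def spectral_norm(W, n_iter=50):
    """Estimate the largest singular value of W by power iteration."""
    u = np.random.default_rng(0).normal(size=W.shape[0])
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    # u = Wv / ||Wv||, so u . (W v) approximates the top singular value.
    return float(u @ W @ v)

def sn_residual(x, W, c=0.9):
    """Residual block x + f(x) with f made c-Lipschitz via spectral
    normalization. Since ||W_hat||_2 <= c < 1 and tanh is 1-Lipschitz,
    the whole map is bi-Lipschitz with constants (1 - c, 1 + c):
        (1 - c) * ||x1 - x2|| <= ||h(x1) - h(x2)|| <= (1 + c) * ||x1 - x2||
    i.e. input distances are approximately preserved in feature space."""
    sigma = spectral_norm(W)
    W_hat = W * min(1.0, c / sigma)  # rescale only if the norm exceeds c
    return x + np.tanh(W_hat @ x)
```

A quick check of the bi-Lipschitz bounds: feed two random inputs through the block and verify their output distance stays within `(1 - c, 1 + c)` times the input distance.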
Pages: 17