Weakly supervised graph learning for action recognition in untrimmed video

Cited by: 3

Authors:

Yao, Xiao [1]

Zhang, Jia [1]

Chen, Ruixuan [1]

Zhang, Dan [2]

Zeng, Yifeng [1]

Affiliations:

[1] Hohai Univ, Coll IoT Engn, Nanjing, Peoples R China

[2] Inner Mongolia Normal Univ, Coll Foreign Languages, Hohhot, Peoples R China

Source:

VISUAL COMPUTER | 2023 / Vol. 39 / No. 11

Keywords:

Action recognition; Weakly supervised; Proposal relations; GCNs;

DOI:

10.1007/s00371-022-02673-1

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification codes:

081202; 0835

Abstract:

Action recognition in real-world scenarios is a challenging task that involves both localizing and classifying actions in untrimmed video. Because untrimmed video from real scenarios lacks fine-grained annotation, existing supervised learning methods are limited in effectiveness and robustness. Moreover, state-of-the-art methods process each action proposal individually and ignore the semantic relationships between proposals that arise from the continuity of video. To address these issues, we propose a weakly supervised approach that explores proposal relations using Graph Convolutional Networks (GCNs). Specifically, the method constructs a graph with action similarity edges and temporal similarity edges to represent the contextual semantic relationships between proposals, and it uses the similarity of action features to weakly supervise the spatial semantic relationship between labeled and unlabeled samples, enabling effective recognition of actions in the video. We validate the proposed method on public benchmarks for untrimmed video (THUMOS14 and ActivityNet). The experimental results show that the proposed method achieves state-of-the-art results with better robustness and generalization.

Citation

Pages: 5469-5483

Page count: 15

49 references in total

[11] Ghanem B., 2018, ARXIV

[12] Hamilton WL, 2017, ADV NEUR IN, V30

[13] Fast Temporal Activity Proposals for Efficient Detection of Human Actions in Untrimmed Videos [J]. Heilbron, Fabian Caba; Niebles, Juan Carlos; Ghanem, Bernard. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016: 1914-1923

[14] Relation Networks for Object Detection [J]. Hu, Han; Gu, Jiayuan; Zhang, Zheng; Dai, Jifeng; Wei, Yichen. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018: 3588-3597

[15] Huang Wenbing, 2018, ADV NEURAL INFORM PR, V31

[16] Jiang Y. G., 2014, THUMOS challenge: Action recognition with a large number of classes, P1

[17] Kipf T. N., 2016, ARXIV

[18] ImageNet Classification with Deep Convolutional Neural Networks [J]. Krizhevsky, Alex; Sutskever, Ilya; Hinton, Geoffrey E. COMMUNICATIONS OF THE ACM, 2017, 60(6): 84-90

[19] Lee P, 2020, AAAI CONF ARTIF INTE, V34, P11320

[20] BSN: Boundary Sensitive Network for Temporal Action Proposal Generation [J]. Lin, Tianwei; Zhao, Xu; Su, Haisheng; Wang, Chongjing; Yang, Ming. COMPUTER VISION - ECCV 2018, PT IV, 2018, 11208: 3-21
