Unsupervised Hierarchical Dynamic Parsing and Encoding for Action Recognition

Cited by: 16
Authors
Su, Bing [1 ]
Zhou, Jiahuan [2 ]
Ding, Xiaoqing [3 ]
Wu, Ying [2 ]
Affiliations
[1] Chinese Acad Sci, Inst Software, Sci & Technol Integrated Informat Syst Lab, Beijing 100190, Peoples R China
[2] Northwestern Univ, Dept Elect Engn & Comp Sci, Evanston, IL 60208 USA
[3] Tsinghua Univ, Dept Elect Engn, Tsinghua Natl Lab Informat Sci & Technol, State Key Lab Intelligent Technol & Syst, Beijing 100084, Peoples R China
Funding
US National Science Foundation; National Natural Science Foundation of China;
Keywords
Action recognition; temporal clustering; hierarchical modeling; dynamic encoding; ENSEMBLE; VECTOR; MODELS; PARTS;
DOI
10.1109/TIP.2017.2745212
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Generally, the evolution of an action is not uniform across the video, but exhibits quite complex rhythms and non-stationary dynamics. To model such non-uniform temporal dynamics, in this paper, we describe a novel hierarchical dynamic parsing and encoding method to capture both the locally smooth dynamics and globally drastic dynamic changes. It parses the dynamics of an action into different layers and encodes such multi-layer temporal information into a joint representation for action recognition. At the first layer, the action sequence is parsed in an unsupervised manner into several smooth-changing stages corresponding to different key poses or temporal structures by temporal clustering. The dynamics within each stage are encoded by mean-pooling or rank-pooling. At the second layer, the temporal information of the ordered dynamics extracted from the previous layer is encoded again by rank-pooling to form the overall representation. Extensive experiments on a gesture action data set (Chalearn Gesture) and three generic action data sets (Olympic Sports, Hollywood2, and UCF101) have demonstrated the effectiveness of the proposed method.
Pages: 5784-5799
Page count: 16
Related Papers
50 records in total
  • [41] Unsupervised open-world human action recognition
    Gutoski, Matheus
    Lazzaretti, Andre Eugenio
    Lopes, Heitor Silverio
    PATTERN ANALYSIS AND APPLICATIONS, 2023, 26 (04) : 1753 - 1770
  • [42] Unsupervised motion representation enhanced network for action recognition
    Yang, Xiaohang
    Kong, Lingtong
    Yang, Jie
    arXiv, 2021,
  • [43] Action Recognition by Hierarchical Mid-level Action Elements
    Lan, Tian
    Zhu, Yuke
    Zamir, Amir Roshan
    Savarese, Silvio
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 4552 - 4560
  • [44] Robust Action Recognition Based on a Hierarchical Model
    Jiang, Xinbo
    Zhong, Fan
    Peng, Qunsheng
    Qin, Xueying
    2013 INTERNATIONAL CONFERENCE ON CYBERWORLDS (CW), 2013, : 191 - 198
  • [45] Learning hierarchical video representation for action recognition
    Li Q.
    Qiu Z.
    Yao T.
    Mei T.
    Rui Y.
    Luo J.
    International Journal of Multimedia Information Retrieval, 2017, 6 (1) : 85 - 98
  • [46] A Hierarchical Learning Approach for Human Action Recognition
    Lemieux, Nicolas
    Noumeir, Rita
    SENSORS, 2020, 20 (17) : 1 - 16
  • [47] A novel hierarchical framework for human action recognition
    Chen, Hongzhao
    Wang, Guijin
    Xue, Jing-Hao
    He, Li
    PATTERN RECOGNITION, 2016, 55 : 148 - 159
  • [48] Constructing Hierarchical Spatiotemporal Information for Action Recognition
    Yao, Guangle
    Zhong, Jiandan
    Lei, Tao
    Liu, Xianyuan
    2018 IEEE SMARTWORLD, UBIQUITOUS INTELLIGENCE & COMPUTING, ADVANCED & TRUSTED COMPUTING, SCALABLE COMPUTING & COMMUNICATIONS, CLOUD & BIG DATA COMPUTING, INTERNET OF PEOPLE AND SMART CITY INNOVATION (SMARTWORLD/SCALCOM/UIC/ATC/CBDCOM/IOP/SCI), 2018, : 596 - 602
  • [49] Hierarchical Posture Representation for Robust Action Recognition
    Chen, Yi
    Yu, Li
    Ota, Kaoru
    Dong, Mianxiong
    IEEE TRANSACTIONS ON COMPUTATIONAL SOCIAL SYSTEMS, 2019, 6 (05) : 1115 - 1125
  • [50] Football Action Recognition using Hierarchical LSTM
    Tsunoda, Takamasa
    Komori, Yasuhiro
    Matsugu, Masakazu
    Harada, Tatsuya
    2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2017, : 155 - 163