Automatic Modelling for Interactive Action Assessment

Cited by: 9
Authors
Gao, Jibin [1]
Pan, Jia-Hui [1]
Zhang, Shao-Jie [1]
Zheng, Wei-Shi [1, 2]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
[2] Peng Cheng Lab, Shenzhen 518055, Peoples R China
Keywords
Action assessment; Interactive action; Video understanding; Surgical skills; Coefficient; Values; Video
DOI
10.1007/s11263-022-01695-5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Action assessment, the task of visually assessing the quality of performing an action, has attracted much attention in recent years, with promising applications in areas such as medical treatment and sporting events. However, most existing methods of action assessment mainly target the actions performed by a single person; in particular, they neglect the asymmetric relations among agents (e.g., between persons and objects), limiting their performance in many nonindividual actions. In this work, we formulate a framework for modelling asymmetric interactions among agents for action assessment, considering the subordinations among agents in many interactive actions. Specifically, we propose an asymmetric interaction learner consisting of an automatic assigner and an asymmetric interaction network search module. The automatic assigner is designed to automatically group agents within an action into a primary agent (e.g., human) and secondary agents (e.g., objects); the asymmetric interaction network search module adaptively learns the asymmetric interactions between these agents. We conduct experiments on the JIGSAWS dataset containing surgical actions and additionally collect two new datasets, TASD-2 and PaSk, for action assessment on interactive sporting actions. The experimental results on these three datasets demonstrate the effectiveness of our framework in achieving state-of-the-art performance. The extensive experiments on the AQA-7 dataset also indicate the robustness of our model in conventional action assessment settings.
Pages: 659-679
Page count: 21