MULTI-STAGE CONTRASTIVE REGRESSION FOR ACTION QUALITY ASSESSMENT

Citations: 3
Authors
An, Qi [1]
Qi, Mengshi [1]
Ma, Huadong [1]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing Key Lab Intelligent Telecommun Software &, Beijing, Peoples R China
Source
2024 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, ICASSP 2024 | 2024
Keywords
Action Quality Assessment; Contrastive Regression; Multi-stage Segmentation
DOI
10.1109/ICASSP48485.2024.10447069
Abstract
In recent years, there has been growing interest in video-based action quality assessment (AQA). Most existing methods solve the AQA problem by considering the entire video, overlooking the inherent stage-level characteristics of actions. To address this issue, we design a novel Multi-stage Contrastive Regression (MCoRe) framework for the AQA task. By segmenting the input video into multiple stages or procedures, this approach efficiently extracts spatial-temporal information while reducing computational costs. Inspired by graph contrastive learning, we propose a new stage-wise contrastive learning loss function to enhance performance. As a result, MCoRe achieves state-of-the-art results on the widely adopted fine-grained AQA dataset. Our source code is available at https://github.com/Angel-1999/MCoRe.
Pages: 4110-4114
Page count: 5