ASTRA: An Action Spotting TRAnsformer for Soccer Videos

Cited by: 3
Authors
Xarles, Artur [1, 2]
Escalera, Sergio [1, 2, 3]
Moeslund, Thomas B. [3]
Clapes, Albert [1, 2]
Affiliations
[1] Univ Barcelona, Barcelona, Spain
[2] Comp Vis Ctr, Barcelona, Spain
[3] Aalborg Univ, Aalborg, Denmark
Source
PROCEEDINGS OF THE 6TH INTERNATIONAL WORKSHOP ON MULTIMEDIA CONTENT ANALYSIS IN SPORTS, MMSPORTS 2023 | 2023
Keywords
computer vision; action spotting; transformer encoder-decoder; uncertainty estimation; balanced mixup;
DOI
10.1145/3606038.3616153
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
In this paper, we introduce ASTRA, a Transformer-based model designed for the task of Action Spotting in soccer matches. ASTRA addresses several challenges inherent in the task and dataset, including the requirement for precise action localization, the presence of a long-tail data distribution, non-visibility in certain actions, and inherent label noise. To do so, ASTRA incorporates (a) a Transformer encoder-decoder architecture to achieve the desired output temporal resolution and to produce precise predictions, (b) a balanced mixup strategy to handle the long-tail distribution of the data, (c) an uncertainty-aware displacement head to capture the label variability, and (d) input audio signal to enhance detection of non-visible actions. Results demonstrate the effectiveness of ASTRA, achieving a tight Average-mAP of 66.82 on the test set. Moreover, in the SoccerNet 2023 Action Spotting challenge, we secure the 3rd position with an Average-mAP of 70.21 on the challenge set.
Pages: 93-102
Page count: 10
Related papers
36 in total
  • [11] Efficient Action Spotting Using Saliency Feature Weighting
    Shi, Yuzhi
    Yamashita, Takayoshi
    Hirakawa, Tsubasa
    Fujiyoshi, Hironobu
    Nakazawa, Mitsuru
    Chae, Yeongnam
    Stenger, Bjorn
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2024, E107D (01) : 105 - 114
  • [12] Action Spotting and Recognition Based on a Spatiotemporal Orientation Analysis
    Derpanis, Konstantinos G.
    Sizintsev, Mikhail
    Cannons, Kevin J.
    Wildes, Richard P.
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2013, 35 (03) : 527 - 540
  • [13] Structured Learning for Action Recognition in Videos
    Long, Yinghan
    Srinivasan, Gopalakrishnan
    Panda, Priyadarshini
    Roy, Kaushik
    IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2019, 9 (03) : 475 - 484
  • [14] Egocentric action anticipation from untrimmed videos
    Rodin, Ivan
    Furnari, Antonino
    Farinella, Giovanni Maria
    IET COMPUTER VISION, 2025, 19 (01)
  • [15] An Automated System for Generating Tactical Performance Statistics for Individual Soccer Players From Videos
    Theagarajan, Rajkumar
    Bhanu, Bir
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (02) : 632 - 646
  • [16] MPEG CDVS Feature Trajectories for Action Recognition in Videos
    Dasari, Radhakrishna
    Chen, Chang Wen
    IEEE 1ST CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2018), 2018, : 301 - 304
  • [17] Automatic summarization of cooking videos using transfer learning and transformer-based models
    P. M. Alen Sadique
    R. V. Aswiga
    Discover Artificial Intelligence, 5 (1):
  • [18] Action Recognition in Videos through a Transfer-Learning-Based Technique
    Lopez-Lozada, Elizabeth
    Sossa, Humberto
    Rubio-Espino, Elsa
    Montiel-Perez, Jesus Yalja
    MATHEMATICS, 2024, 12 (20)
  • [19] Rethinking Online Action Detection in Untrimmed Videos: A Novel Online Evaluation Protocol
    Baptista-Rios, Marcos
    Lopez-Sastre, Roberto J.
    Caba Heilbron, Fabian
    Van Gemert, Jan C.
    Acevedo-Rodriguez, F. Javier
    Maldonado-Bascon, Saturnino
    IEEE ACCESS, 2020, 8 : 5139 - 5146
  • [20] Top-down attention recurrent VLAD encoding for action recognition in videos
    Sudhakaran, Swathikiran
    Lanz, Oswald
    INTELLIGENZA ARTIFICIALE, 2019, 13 (01) : 107 - 118