ASTRA: An Action Spotting TRAnsformer for Soccer Videos

Cited by: 3

Authors:

Xarles, Artur [1,2]

Escalera, Sergio [1,2,3]

Moeslund, Thomas B. [3]

Clapes, Albert [1,2]

Affiliations:

[1] Univ Barcelona, Barcelona, Spain

[2] Comp Vis Ctr, Barcelona, Spain

[3] Aalborg Univ, Aalborg, Denmark

Source:

PROCEEDINGS OF THE 6TH INTERNATIONAL WORKSHOP ON MULTIMEDIA CONTENT ANALYSIS IN SPORTS, MMSPORTS 2023 | 2023

Keywords:

computer vision; action spotting; transformer encoder-decoder; uncertainty estimation; balanced mixup

DOI:

10.1145/3606038.3616153

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification code:

081203; 0835

Abstract:

In this paper, we introduce ASTRA, a Transformer-based model designed for the task of Action Spotting in soccer matches. ASTRA addresses several challenges inherent in the task and dataset, including the requirement for precise action localization, the presence of a long-tail data distribution, non-visibility in certain actions, and inherent label noise. To do so, ASTRA incorporates (a) a Transformer encoder-decoder architecture to achieve the desired output temporal resolution and to produce precise predictions, (b) a balanced mixup strategy to handle the long-tail distribution of the data, (c) an uncertainty-aware displacement head to capture the label variability, and (d) input audio signal to enhance detection of non-visible actions. Results demonstrate the effectiveness of ASTRA, achieving a tight Average-mAP of 66.82 on the test set. Moreover, in the SoccerNet 2023 Action Spotting challenge, we secure the 3rd position with an Average-mAP of 70.21 on the challenge set.
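The abstract names a balanced mixup strategy for the long-tail class distribution but gives no details. The following is only a minimal sketch of one plausible reading: ordinary mixup (a convex combination of two samples and their labels) in which the second sample is drawn from a class-balanced sampler. The function names, the `alpha` value, and the uniform-over-classes sampling are assumptions for illustration, not the authors' implementation.

```python
import random


def mixup(x_a, y_a, x_b, y_b, lam):
    """Convex combination of two (features, label-vector) pairs."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def class_balanced_pick(samples_by_class, rng):
    """Pick a class uniformly, then one of its samples (assumed scheme),
    so rare classes are drawn as often as frequent ones."""
    cls = rng.choice(sorted(samples_by_class))
    return rng.choice(samples_by_class[cls])


def balanced_mixup(sample, samples_by_class, rng, alpha=0.2):
    """Mix a regular sample with a class-balanced one.
    alpha is a hypothetical Beta-distribution parameter."""
    lam = rng.betavariate(alpha, alpha)
    x_b, y_b = class_balanced_pick(samples_by_class, rng)
    return mixup(sample[0], sample[1], x_b, y_b, lam)
```

For example, mixing a sample of a frequent class with `samples_by_class = {"goal": [...], "pass": [...]}` yields a soft label whose mass is split between the two classes in proportion `lam : 1 - lam`.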

Pages: 93-102

Page count: 10

36 items in total

[11] Efficient Action Spotting Using Saliency Feature Weighting
Shi, Yuzhi
Yamashita, Takayoshi
Hirakawa, Tsubasa
Fujiyoshi, Hironobu
Nakazawa, Mitsuru
Chae, Yeongnam
Stenger, Bjorn
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2024, E107D (01) : 105 - 114
[12] Action Spotting and Recognition Based on a Spatiotemporal Orientation Analysis
Derpanis, Konstantinos G.
Sizintsev, Mikhail
Cannons, Kevin J.
Wildes, Richard P.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2013, 35 (03) : 527 - 540
[13] Structured Learning for Action Recognition in Videos
Long, Yinghan
Srinivasan, Gopalakrishnan
Panda, Priyadarshini
Roy, Kaushik
IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS, 2019, 9 (03) : 475 - 484
[14] Egocentric action anticipation from untrimmed videos
Rodin, Ivan
Furnari, Antonino
Farinella, Giovanni Maria
IET COMPUTER VISION, 2025, 19 (01)
[15] An Automated System for Generating Tactical Performance Statistics for Individual Soccer Players From Videos
Theagarajan, Rajkumar
Bhanu, Bir
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2021, 31 (02) : 632 - 646
[16] MPEG CDVS Feature Trajectories for Action Recognition in Videos
Dasari, Radhakrishna
Chen, Chang Wen
IEEE 1ST CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2018), 2018, : 301 - 304
[17] Automatic summarization of cooking videos using transfer learning and transformer-based models
P. M. Alen Sadique
R. V. Aswiga
Discover Artificial Intelligence, 5 (1):
[18] Action Recognition in Videos through a Transfer-Learning-Based Technique
Lopez-Lozada, Elizabeth
Sossa, Humberto
Rubio-Espino, Elsa
Montiel-Perez, Jesus Yalja
MATHEMATICS, 2024, 12 (20)
[19] Rethinking Online Action Detection in Untrimmed Videos: A Novel Online Evaluation Protocol
Baptista-Rios, Marcos
Lopez-Sastre, Roberto J.
Caba Heilbron, Fabian
Van Gemert, Jan C.
Acevedo-Rodriguez, F. Javier
Maldonado-Bascon, Saturnino
IEEE ACCESS, 2020, 8 : 5139 - 5146
[20] Top-down attention recurrent VLAD encoding for action recognition in videos
Sudhakaran, Swathikiran
Lanz, Oswald
INTELLIGENZA ARTIFICIALE, 2019, 13 (01) : 107 - 118