Leveraging cross-resolution attention for effective extreme low-resolution video action recognition

Cited by: 0

Authors:

Oguz, Oguzhan [1]

Ikizler-Cinbis, Nazli [1]

Affiliations:

[1] Hacettepe Univ, Dept Comp Engn, TR-06800 Ankara, Turkiye

Source:

SIGNAL IMAGE AND VIDEO PROCESSING | 2024 / Vol. 18 / Issue 01

Keywords:

Extreme low-resolution action recognition; Knowledge distillation; Cross-resolution attention

DOI:

10.1007/s11760-023-02766-x

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

Recognizing human actions in extremely low-resolution (eLR) videos poses a formidable challenge in the action recognition domain due to the lack of temporal and spatial information in the corresponding eLR frames. In this work, we propose a novel architecture for human action recognition in eLR videos. The proposed approach and its variants utilize an expanded knowledge distillation scheme that provides the essential flow of information from high-resolution (HR) frames to eLR frames. To further improve the generalization capability, we integrate cross-resolution attention modules that can operate without HR information at inference time. Additionally, we investigate the impact of an eLR data preprocessing pipeline that leverages a super-resolution algorithm and experimentally show the efficacy of the proposed models in eLR space. Our experiments indicate the importance of examining eLR human action recognition and demonstrate that the proposed methods can surpass or compete with the current state-of-the-art methods, achieving effective generalization on both the UCF-101 and HMDB-51 datasets.
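As a rough illustration of the knowledge distillation idea mentioned in the abstract, the sketch below computes a standard temperature-scaled distillation loss between a teacher (assumed here to see HR frames) and a student (assumed to see eLR frames). This is the generic soft-target formulation, not the paper's expanded scheme, which the abstract does not detail; the function names, `temperature`, and `alpha` are hypothetical choices made only for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Generic soft-target distillation loss (illustrative assumption).

    student_logits: class scores from the eLR (student) network
    teacher_logits: class scores from the HR (teacher) network
    label: index of the ground-truth action class
    """
    # Soft-target term: KL(teacher || student) at the given temperature.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps)
             for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Hard-label term: cross-entropy of the student at temperature 1.
    ce = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft-term gradient scale comparable.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Hypothetical logits for a three-class problem.
loss = distillation_loss([1.0, 0.5, 0.2], [3.0, 0.5, -1.0], label=0)
```

Only the student is needed at inference time in this formulation, which is consistent with the abstract's statement that the model operates without HR information during inference.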


Pages: 399-406

Page count: 8

