Leveraging cross-resolution attention for effective extreme low-resolution video action recognition

Cited by: 1

Authors:

Oguz, Oguzhan [1]

Ikizler-Cinbis, Nazli [1]

Affiliations:

[1] Hacettepe Univ, Dept Comp Engn, TR-06800 Ankara, Turkiye

Source:

SIGNAL IMAGE AND VIDEO PROCESSING | 2024 / Vol. 18 / Issue 01

Keywords:

Extreme low-resolution action recognition; Knowledge distillation; Cross-resolution attention

DOI:

10.1007/s11760-023-02766-x

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

Recognizing human actions in extremely low-resolution (eLR) videos poses a formidable challenge in the action recognition domain due to the lack of temporal and spatial information in the corresponding eLR frames. In this work, we propose a novel eLR video human action recognition architecture that recognizes actions in an eLR setup. The proposed approach and its variants utilize an expanded knowledge distillation scheme that provides the essential flow of information from high-resolution (HR) frames to eLR frames. To further improve the generalization capability, we integrate cross-resolution attention modules that can operate without HR information at inference time. Additionally, we investigate the impact of an eLR data preprocessing pipeline that leverages a super-resolution algorithm and experimentally show the efficacy of the proposed models in eLR space. Our experiments indicate the importance of examining eLR human action recognition and demonstrate that the proposed methods can surpass or compete with the current state-of-the-art methods, achieving effective generalization capabilities on both the UCF-101 and HMDB-51 datasets.


Pages: 399-406

Page count: 8

