MiniROAD: Minimal RNN Framework for Online Action Detection

Times cited: 11
Authors
An, Joungbin [1]
Kang, Hyolim [1]
Han, Su Ho [1]
Yang, Ming-Hsuan [1, 2, 3]
Kim, Seon Joo [1]
Affiliations
[1] Yonsei Univ, Seoul, South Korea
[2] UC Merced, Merced, CA USA
[3] Google Res, Mountain View, CA USA
Source
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023) | 2023
Keywords
DOI
10.1109/ICCV51070.2023.00949
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Online Action Detection (OAD) is the task of identifying actions in streaming videos without access to future frames. Much effort has been devoted to effectively capturing long-range dependencies, with transformers receiving the spotlight for their ability to capture long-range temporal structures. In contrast, RNNs have received less attention lately, due to their lower performance compared to recent methods that utilize transformers. In this paper, we investigate the underlying reasons for the inferior performance of RNNs compared to transformer-based algorithms. Our findings indicate that the discrepancy between training and inference is the primary hindrance to the effective training of RNNs. To address this, we propose applying non-uniform weights to the loss computed at each time step, which allows the RNN model to learn from the predictions made in an environment that better resembles the inference stage. Extensive experiments on three benchmark datasets, THUMOS, TVSeries, and FineAction, demonstrate that a minimal RNN-based model trained with the proposed methodology performs equally or better than the existing best methods with a significant increase in efficiency. The code is available at https://github.com/jbistanbul/MiniROAD.
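The abstract does not state which weights are applied, so the following is only a minimal sketch of the general idea of weighting a per-time-step loss non-uniformly. It assumes a linearly increasing schedule, so that later steps of a training clip, where the recurrent state has seen more context, count more than early ones. The function names (`linear_ramp_weights`, `weighted_sequence_loss`) and the `start`/`end` parameters are hypothetical and are not taken from the paper or its repository.

```python
import math


def softmax_cross_entropy(logits, target):
    """Cross-entropy for one time step, computed from raw class scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def linear_ramp_weights(num_steps, start=0.0, end=1.0):
    """Hypothetical schedule: weights rise linearly, then are normalized to sum to 1."""
    if num_steps == 1:
        return [1.0]
    raw = [start + (end - start) * t / (num_steps - 1) for t in range(num_steps)]
    total = sum(raw)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in raw]


def weighted_sequence_loss(step_logits, step_targets, weights):
    """Weighted sum of the per-step losses over one training clip."""
    losses = [softmax_cross_entropy(l, y) for l, y in zip(step_logits, step_targets)]
    return sum(w * l for w, l in zip(weights, losses))


# Toy clip: 4 time steps, 2 classes. Early steps are wrong, later steps are right.
logits = [[2.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 2.0]]
targets = [1, 1, 1, 1]

uniform = weighted_sequence_loss(logits, targets, [0.25] * 4)
ramped = weighted_sequence_loss(logits, targets, linear_ramp_weights(4))
print(f"uniform weights: {uniform:.4f}")
print(f"ramped weights:  {ramped:.4f}")
```

In this toy clip the ramped loss is smaller than the uniform one, because the mistakes fall on the early steps that the schedule down-weights. With `start=0.0` the first step contributes nothing to the loss, which is a design choice of this sketch and not a claim about the published method.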
Pages: 10307-10316
Number of pages: 10