Self-Similarity Action Proposal

Cited by: 3
Authors
Liu, Xiaolong [1]
Sun, Yuchao [2]
Lu, Jianghu [3]
Yao, Cong [2]
Zhou, Yu [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Elect Informat & Commun, Wuhan 430074, Peoples R China
[2] Megvii Co Ltd, Beijing 100190, Peoples R China
[3] Bigo Technol Pte Ltd, Beijing 100086, Peoples R China
Funding
National Key R&D Program of China; National Natural Science Foundation of China
Keywords
Proposals; Generators; Image segmentation; Sampling methods; Motion segmentation; Feature extraction; Visualization; Action proposal; action recognition; self-similarity; temporal action detection; temporal action localization
DOI
10.1109/LSP.2020.3037796
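Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronic and communication technology]
Discipline classification codes
0808; 0809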
Abstract
Temporal action proposal generation, which aims to locate temporal segments that may contain actions, is a key preliminary step for various video analysis tasks such as temporal action detection. In this letter, we present Self-Similarity Action Proposal (SSAP), a simple method that generates action proposals using the self-similarity of videos. Specifically, structural similarity, a basic low-level index, is adopted to measure the similarity between adjacent frames. Potential action boundaries are located by thresholding the similarity values, and candidate action segments are then generated by grouping the boundaries. A segment evaluation module (SEM) is further employed to score and refine the segments. The framework achieves state-of-the-art performance on THUMOS14 and competitive results on ActivityNet v1.3. Notably, on THUMOS14, it achieves an improvement of over 4% in average recall at 50 proposals and a 3.3% gain in mAP@0.7 when combined with an existing action classifier for temporal action detection.
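The boundary-and-grouping idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the single-window (global) SSIM, the threshold value, the pairwise grouping rule, and the function names `global_ssim`, `candidate_boundaries`, and `candidate_segments` are all assumptions made for illustration, and the segment evaluation module (SEM) is not covered.

```python
import numpy as np


def global_ssim(a, b, data_range=255.0):
    """Simplified SSIM computed over the whole frame (one window).

    Assumption: standard SSIM uses local sliding windows; this global
    variant is only meant to keep the sketch short.
    """
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    c1 = (0.01 * data_range) ** 2  # usual SSIM stabilising constants
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    var_a, var_b = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (var_a + var_b + c2)
    return num / den


def candidate_boundaries(frames, threshold=0.5):
    """Return indices i where frame i is dissimilar to frame i - 1.

    The threshold of 0.5 is a hypothetical value, not taken from the paper.
    """
    return [
        i
        for i in range(1, len(frames))
        if global_ssim(frames[i - 1], frames[i]) < threshold
    ]


def candidate_segments(boundaries, min_len=1, max_len=None):
    """Group boundaries into (start, end) pairs.

    Assumption: every earlier boundary is paired with every later one,
    subject to optional length limits.
    """
    segments = []
    for k, start in enumerate(boundaries):
        for end in boundaries[k + 1:]:
            length = end - start
            if length >= min_len and (max_len is None or length <= max_len):
                segments.append((start, end))
    return segments
```

Given a list of grayscale frames as 2-D arrays, `candidate_segments(candidate_boundaries(frames))` yields frame-index pairs that a scoring stage could then rank and refine.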
Pages: 2064-2068
Page count: 5
Related papers
31 in total
[1]  
[Anonymous], 2016, CUHK & ETHZ & SIAT submission to ActivityNet challenge 2016
[2]  
[Anonymous], 2014, ECCV WORKSH
[3]  
Bodla N, 2017, Arxiv, DOI arXiv:1704.04503
[4]   SST: Single-Stream Temporal Action Proposals [J].
Buch, Shyamal ;
Escorcia, Victor ;
Shen, Chuanqi ;
Ghanem, Bernard ;
Niebles, Juan Carlos .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :6373-6382
[5]  
Heilbron FC, 2015, PROC CVPR IEEE, P961, DOI 10.1109/CVPR.2015.7298698
[6]   Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset [J].
Carreira, Joao ;
Zisserman, Andrew .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :4724-4733
[7]   Learning Spatiotemporal Features with 3D Convolutional Networks [J].
Du Tran ;
Bourdev, Lubomir ;
Fergus, Rob ;
Torresani, Lorenzo ;
Paluri, Manohar .
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, :4489-4497
[8]   CenterNet: Keypoint Triplets for Object Detection [J].
Duan, Kaiwen ;
Bai, Song ;
Xie, Lingxi ;
Qi, Honggang ;
Huang, Qingming ;
Tian, Qi .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :6568-6577
[9]   DAPs: Deep Action Proposals for Action Understanding [J].
Escorcia, Victor ;
Heilbron, Fabian Caba ;
Niebles, Juan Carlos ;
Ghanem, Bernard .
COMPUTER VISION - ECCV 2016, PT III, 2016, 9907 :768-784
[10]   SlowFast Networks for Video Recognition [J].
Feichtenhofer, Christoph ;
Fan, Haoqi ;
Malik, Jitendra ;
He, Kaiming .
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, :6201-6210