Adversarial Background-Aware Loss for Weakly-Supervised Temporal Activity Localization

Cited by: 91
Authors
Min, Kyle [1]
Corso, Jason J. [1]
Affiliation
[1] Univ Michigan, Ann Arbor, MI 48109 USA
Source
COMPUTER VISION - ECCV 2020, PT XIV | 2020 / Vol. 12359
Keywords
A2CL-PT; Temporal activity localization; Adversarial learning; Weakly-supervised learning; Center loss with a pair of triplets;
DOI
10.1007/978-3-030-58568-6_17
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Temporally localizing activities within untrimmed videos has been extensively studied in recent years. Despite recent advances, existing methods for weakly-supervised temporal activity localization struggle to recognize when an activity is not occurring. To address this issue, we propose a novel method named A2CL-PT. Two triplets of the feature space are considered in our approach: one triplet is used to learn discriminative features for each activity class, and the other one is used to distinguish the features where no activity occurs (i.e. background features) from activity-related features for each video. To further improve the performance, we build our network using two parallel branches which operate in an adversarial way: the first branch localizes the most salient activities of a video and the second one finds other supplementary activities from non-localized parts of the video. Extensive experiments performed on THUMOS14 and ActivityNet datasets demonstrate that our proposed method is effective. Specifically, the average mAP of IoU thresholds from 0.1 to 0.9 on the THUMOS14 dataset is significantly improved from 27.9% to 30.0%.
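The evaluation metric mentioned in the abstract (average mAP over IoU thresholds from 0.1 to 0.9) rests on the temporal IoU between a predicted segment and a ground-truth segment. The following is a minimal sketch of that building block only, not the authors' evaluation code; the function names `temporal_iou`, `thresholds_passed`, and `average_map` are hypothetical, and segments are assumed to be `(start, end)` pairs in seconds.

```python
def temporal_iou(pred, gt):
    """Intersection-over-union of two (start, end) segments on the time axis."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


# IoU thresholds 0.1, 0.2, ..., 0.9, as stated in the abstract.
IOU_THRESHOLDS = [round(0.1 * k, 1) for k in range(1, 10)]


def thresholds_passed(pred, gt):
    """Thresholds at which `pred` would count as a correct localization of `gt`."""
    iou = temporal_iou(pred, gt)
    return [t for t in IOU_THRESHOLDS if iou >= t]


def average_map(map_per_threshold):
    """Mean of per-threshold mAP values (assumed: dict mapping threshold -> mAP)."""
    return sum(map_per_threshold.values()) / len(map_per_threshold)
```

For example, a prediction of `(2.0, 6.0)` against a ground truth of `(4.0, 8.0)` overlaps for 2 seconds out of a 6-second union, so its IoU is 1/3 and it counts as correct only at thresholds 0.1, 0.2, and 0.3.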
Pages: 283-299
Number of pages: 17