Zero-shot action recognition by clustered representation with redundancy-free features

Cited by: 0
Authors
Xia, Limin [1]
Wen, Xin [1]
Affiliations
[1] Central South University, School of Automation, Changsha 410083, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
Action recognition; Zero-shot learning; Generative adversarial networks; Cluster optimization
DOI
10.1007/s00138-023-01470-7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Zero-shot action recognition (ZSAR) is a practical and challenging problem: it compensates for a shortcoming of existing action recognition by recognizing action classes that have no visual representation during training. However, existing ZSAR methods overlook the fact that the generated features contain many outliers, which harm recognition. A new ZSAR method is proposed that suppresses this defect through clustered representation with redundancy-free features. In addition, a generative adversarial network (GAN) with gradient penalty is trained to synthesize stable features, which addresses data imbalance and alleviates the unstable feature generation that limits existing methods. To reduce feature dimensionality and subsequent computation, a redundancy-free feature is introduced into ZSAR. Experiments on the public Olympic Sports, HMDB51, and UCF101 datasets show that our method outperforms state-of-the-art approaches in zero-shot action recognition, with absolute gains of 1.8%, 0.3%, and 1.7%, respectively.
Pages: 18