34 entries in total
[1] ViViT: A Video Vision Transformer. 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), 2021: 6816-6826.
[2] End-to-End Object Detection with Transformers. Computer Vision - ECCV 2020, Part I, 2020, vol. 12346: 213-229.
[3] Chen P., 2021, P IEEECVF INT C COMP, P11833
[4] Chi C, 2020, AAAI CONF ARTIF INTE, V34, P10639
[5] Pedestrian Attribute Recognition At Far Distance. Proceedings of the 2014 ACM Conference on Multimedia (MM'14), 2014: 789-792.
[6] Dosovitskiy A., 2021, P 9 INT C LEARN REPR
[7] Eom C., 2019, Advances in neural information processing systems, V32, P1
[8] Visual Attention Consistency under Image Transforms for Multi-Label Image Classification. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2019), 2019: 729-739.
[9] Deep Residual Learning for Image Recognition. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016: 770-778.
[10] Jia J., 2021, PROC INT C COMPUT VI, P962