Transformer-Based Context Condensation for Boosting Feature Pyramids in Object Detection

Cited by: 0

Authors:

Zhe Chen

Jing Zhang

Yufei Xu

Dacheng Tao

Affiliations:

[1] The University of Sydney, Faculty of Engineering, School of Computer Science

Source:

International Journal of Computer Vision | 2023, Vol. 131

Keywords:

Object detection; Feature pyramid; Context modeling; 35A01; 65L10; 65L12; 65L20; 65L70;

DOI:

Not available

CLC number:

Subject classification number:

Abstract:

Current object detectors typically have a feature pyramid (FP) module for multi-level feature fusion (MFF), which aims to mitigate the gap between features from different levels and form a comprehensive object representation to achieve better detection performance. However, they usually require heavy cross-level connections or iterative refinement to obtain better MFF results, making them complicated in structure and inefficient in computation. To address these issues, we propose a novel and efficient context modeling mechanism that can help existing FPs deliver better MFF results while effectively reducing computational costs. In particular, we introduce a novel insight that comprehensive contexts can be decomposed and condensed into two types of representations for higher efficiency: a locally concentrated representation and a globally summarized representation. The former focuses on extracting context cues from nearby areas, while the latter extracts general contextual representations of the whole image scene as global context cues. Having collected the condensed contexts, we employ a Transformer decoder to investigate the relations between them and each local feature from the FP and then refine the MFF results accordingly. As a result, we obtain a simple and light-weight Transformer-based Context Condensation (TCC) module, which can boost various FPs and lower their computational costs simultaneously. Extensive experimental results on the challenging MS COCO dataset show that TCC is compatible with four representative FPs, consistently improves their detection accuracy by up to 7.8% in terms of average precision, and reduces their complexity by up to around 20% in terms of GFLOPs, helping them achieve state-of-the-art performance more efficiently. Code will be released at https://github.com/zhechen/TCC.
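The context-condensation idea in the abstract can be illustrated with a small sketch. This is not the authors' TCC module: it assumes a single pyramid level, average pooling as the condensation step, and one parameter-free cross-attention step (no learned projections, no multi-head attention) standing in for the Transformer decoder. The function names `condense` and `refine` and the `window` parameter are hypothetical.

```python
import math

def mean_vec(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def condense(fmap, window):
    """Condense an H x W grid of feature vectors into two context types.

    Assumption: local contexts are window-wise averages and the global
    context is the average over the whole map.
    """
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for top in range(0, h, window):
        row = []
        for left in range(0, w, window):
            cell = [fmap[i][j]
                    for i in range(top, min(top + window, h))
                    for j in range(left, min(left + window, w))]
            row.append(mean_vec(cell))
        pooled.append(row)
    global_token = mean_vec([vec for row in fmap for vec in row])
    return pooled, global_token

def refine(fmap, window=2):
    """Refine each feature by attending over nearby pooled tokens plus the global token."""
    pooled, global_token = condense(fmap, window)
    dim = len(global_token)
    refined = []
    for i, row in enumerate(fmap):
        out_row = []
        for j, query in enumerate(row):
            a, b = i // window, j // window
            # Local contexts: pooled tokens in the 3 x 3 neighbourhood of windows.
            contexts = [pooled[p][q]
                        for p in range(max(a - 1, 0), min(a + 2, len(pooled)))
                        for q in range(max(b - 1, 0), min(b + 2, len(pooled[0])))]
            contexts.append(global_token)
            # Scaled dot-product attention with the local feature as the query.
            scores = [sum(x * y for x, y in zip(query, c)) / math.sqrt(dim)
                      for c in contexts]
            weights = softmax(scores)
            attended = [sum(wt * c[k] for wt, c in zip(weights, contexts))
                        for k in range(dim)]
            # Residual connection keeps the original feature.
            out_row.append([x + y for x, y in zip(query, attended)])
        refined.append(out_row)
    return refined

# Toy 4 x 4 feature map with 2-dimensional features.
toy_map = [[[float(i), float(j)] for j in range(4)] for i in range(4)]
toy_refined = refine(toy_map, window=2)
```

In this sketch each query attends over at most ten tokens (up to nine pooled windows plus one global token) rather than over every position in the map, which is the kind of cost reduction the abstract attributes to condensing the contexts.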

Pages: 2738-2756

Page count: 18
