AODGCN: Adaptive object detection with attention-guided dynamic graph convolutional network

Cited by: 0
Authors
Zhang, Meng [1]
Guo, Yina [1]
Wang, Haidong [1]
Shangguan, Hong [1]
Affiliations
[1] Taiyuan Univ Sci & Technol, Sch Elect & Informat Engn, Taiyuan, Shanxi, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Object detection; Image classification; Graphs; Attention mechanism; Graph convolutional network
DOI
10.1016/j.cviu.2025.104386
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Various classifiers based on convolutional neural networks have been successfully applied to image classification in object detection. However, object detection is much more sophisticated and most classifiers used in this context exhibit limitations in capturing contextual information, particularly in scenarios with complex backgrounds or occlusions. Additionally, they lack spatial awareness, resulting in the loss of spatial structure and inadequate modeling of object details and context. In this paper, we propose an adaptive object detection approach using an attention-guided dynamic graph convolutional network (AODGCN). AODGCN represents images as graphs, enabling the capture of object properties such as connectivity, proximity, and hierarchical relationships. Attention mechanisms guide the model to focus on informative regions, highlighting relevant features while suppressing background information. This attention-guided approach enhances the model's ability to capture discriminative features. Furthermore, the dynamic graph convolutional network (D-GCN) adjusts the receptive field size and weight coefficients based on object characteristics, enabling adaptive detection of objects with varying sizes. The achieved results demonstrate the effectiveness of AODGCN on the MS-COCO 2017 dataset, with a significant improvement of 1.6% in terms of mean average precision (mAP) compared to state-of-the-art algorithms.
Pages: 12