ALFPN: Adaptive Learning Feature Pyramid Network for Small Object Detection

Cited by: 8

Authors:

Chen, Haolin [1,2]

Wang, Qi [1,2]

Ruan, Weijian [3]

Zhu, Jingxiang [2]

Lei, Liang [2]

Wu, Xue [1]

Hao, Gefei [1]

Affiliations:

[1] Guizhou Univ, Coll Comp Sci & Technol, State Key Lab Publ Big Data, Guiyang 550025, Guizhou, Peoples R China

[2] Guangdong Univ Technol, Sch Phys & Optoelect Engn, Guangzhou 510006, Guangdong, Peoples R China

[3] Chinese Acad Sci, Shenzhen Inst Adv Technol, Shenzhen 518055, Guangdong, Peoples R China

Source:

INTERNATIONAL JOURNAL OF INTELLIGENT SYSTEMS | 2023 / Vol. 2023

Funding:

National Natural Science Foundation of China;

Keywords:

Classification (of information) - Feature extraction - Object recognition - Semantics;

DOI:

10.1155/2023/6266209

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Object detection has become a crucial technology in intelligent vision systems, enabling automatic detection of target objects. While most detectors perform well on open datasets, they often struggle with small-scale objects. This is because traditional top-down feature fusion methods weaken the semantic and location information of small objects, leading to poor classification performance. To address this issue, we propose a novel feature pyramid network, the adaptive learnable feature pyramid network (ALFPN). Our approach features an adaptive feature inspection that incorporates learnable fusion coefficients when fusing feature layers from different levels, helping the network learn features with less noise. In addition, we construct a context-aligned supervisor that adjusts the feature maps fused at different levels to avoid scaling-related offset effects. Our experiments demonstrate that our method achieves state-of-the-art results and is highly robust for small object detection on the TT-100K, PASCAL VOC, and COCO datasets. These findings indicate that a model's ability to extract discriminative features is positively correlated with its performance in detecting small objects.
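The abstract does not give the exact form of the learnable fusion coefficients, so the following is only a minimal sketch of the general idea: a coarse pyramid level is upsampled and blended with a finer level using weights that a network could learn. The softmax normalization, the nearest-neighbour upsampling, and every name here (`upsample_nearest_2x`, `fuse`, `logits`) are illustrative assumptions, not the authors' implementation. In a real detector the logits would be trainable parameters updated by backpropagation; here they are fixed by hand.

```python
import math


def upsample_nearest_2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each value horizontally
        out.append(wide)
        out.append(list(wide))                     # repeat each row vertically
    return out


def softmax(logits):
    """Turn unconstrained logits into positive weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(fine, coarse, logits):
    """Blend a fine map with an upsampled coarse map.

    `logits` stands in for the learnable fusion coefficients (assumed form).
    """
    w_fine, w_coarse = softmax(logits)
    up = upsample_nearest_2x(coarse)
    return [
        [w_fine * f + w_coarse * c for f, c in zip(fine_row, up_row)]
        for fine_row, up_row in zip(fine, up)
    ]


# Hypothetical 4x4 fine level and 2x2 coarse level.
fine = [[1.0] * 4 for _ in range(4)]
coarse = [[10.0, 20.0], [30.0, 40.0]]

fused_equal = fuse(fine, coarse, [0.0, 0.0])  # equal logits: plain average
fused_fine = fuse(fine, coarse, [2.0, 0.0])   # larger logit favours the fine level
```

With equal logits the result reduces to a plain average of the two levels, which is close to conventional fixed fusion; unequal logits let one level dominate, which is the kind of freedom learnable coefficients would provide.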


Pages: 14
