N-IoU: better IoU-based bounding box regression loss for object detection

Cited by: 17

Authors:

Su, Keke ^{[1,2]}

Cao, Lihua ^{[1]}

Zhao, Botong ^{[3]}

Li, Ning ^{[1]}

Wu, Di ^{[1]}

Han, Xiyu ^{[1]}

Affiliations:

[1] Chinese Acad Sci, Changchun Inst Opt Fine Mech & Phys, Southeast Lake Rd, Changchun 130033, Peoples R China

[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China

[3] East China Normal Univ, Sch Commun & Elect Engn, Shanghai 200241, Peoples R China

Source:

NEURAL COMPUTING & APPLICATIONS | 2024 / Vol. 36 / Issue 06

Keywords:

Bounding box regression; IoU; N-IoU; Regression loss; Object detection; Computer vision; NETWORKS;

DOI:

10.1007/s00521-023-09133-4

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Object detection is one of the core tasks of computer vision, and bounding box (bbox) regression is one of its basic subtasks. In recent research, bbox regression has commonly relied on the Intersection over Union (IoU) loss and its improved variants. In this paper, for the first time, we introduce the Dice coefficient into the regression loss calculation and propose a new measure that is superior to, and can replace, IoU. We define three properties of the new measure and prove them by mathematical reasoning and analysis of existing work. We also propose the N-IoU family of regression losses and demonstrate its superiority through simulation experiments and comparative experiments. The main results of this paper are: (1) the proposed measure is better than IoU for evaluating bounding box regression, and its three properties can serve as a broad criterion for designing regression loss functions; and (2) we propose the N-IoU loss, whose parameter n can be tuned, making it adaptable to different application scenarios with greater flexibility and better regression performance.
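For orientation, the two overlap measures named in the abstract can be sketched for axis-aligned boxes as follows. This is a minimal illustration of the standard IoU and Dice coefficient only; it is not the paper's N-IoU loss, whose definition and parameter n are not given in this record. The `(x1, y1, x2, y2)` box layout and all function names are assumptions made for the example.

```python
from typing import Tuple

# Assumed box layout: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def box_area(box: Box) -> float:
    """Area of an axis-aligned box (0 for degenerate boxes)."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def intersection_area(a: Box, b: Box) -> float:
    """Area of the overlap between two boxes (0 if they are disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def iou(a: Box, b: Box) -> float:
    """Standard Intersection over Union: |A n B| / |A u B|."""
    inter = intersection_area(a, b)
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def dice(a: Box, b: Box) -> float:
    """Standard Dice coefficient: 2 |A n B| / (|A| + |B|)."""
    inter = intersection_area(a, b)
    total = box_area(a) + box_area(b)
    return 2.0 * inter / total if total > 0 else 0.0


if __name__ == "__main__":
    pred = (0.0, 0.0, 2.0, 2.0)
    target = (1.0, 1.0, 3.0, 3.0)
    print("IoU :", iou(pred, target))
    print("Dice:", dice(pred, target))
```

For any pair of boxes the two measures are linked by Dice = 2·IoU / (1 + IoU), so Dice is never smaller than IoU and the two agree at 0 and 1. A common way to turn either measure into a regression loss is `1 - measure`, but how the paper builds its loss from the Dice coefficient is not stated here.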

Citation

Pages: 3049-3063

Page count: 15
