Corner-Point and Foreground-Area IoU Loss: Better Localization of Small Objects in Bounding Box Regression

Cited by: 6
Authors
Cai, Delong [1 ,2 ]
Zhang, Zhaoyun [1 ]
Zhang, Zhi [1 ]
Affiliations
[1] DongGuan Univ Technol, Sch Elect Engn & Intelligentizat, Dongguan 523000, Peoples R China
[2] DongGuan Univ Technol, Sch Comp Sci & Technol, Dongguan 523000, Peoples R China
Keywords
object detection; loss function; small object; bounding box regression;
DOI
10.3390/s23104961
CLC Number
O65 [Analytical Chemistry];
Subject Classification Codes
070302; 081704;
Abstract
Bounding box regression is a crucial step in object detection, directly affecting the localization performance of the detected objects. In small object detection especially, a well-designed bounding box regression loss can significantly alleviate the problem of missed small objects. However, the broad family of Intersection over Union (IoU) losses, referred to here as Broad IoU (BIoU) losses, suffers from two major problems in bounding box regression: (i) BIoU losses cannot provide effective fitting information as the predicted box approaches the target box, resulting in slow convergence and inaccurate regression results; (ii) most localization loss functions do not fully exploit the spatial information of the target, namely the target's foreground area, during the fitting process. This paper therefore proposes the Corner-point and Foreground-area IoU (CFIoU) loss function to overcome these issues. First, we replace the normalized center-point distance used in BIoU losses with the normalized corner-point distance between the two boxes, which effectively suppresses the degradation of BIoU losses to the plain IoU loss when the two boxes are close. Second, we add adaptive target information to the loss function to provide richer guidance for the bounding box regression process, especially in small object detection. Finally, we conducted simulation experiments on bounding box regression to validate our hypothesis, and quantitatively compared the current mainstream BIoU losses with the proposed CFIoU loss on the small-object public datasets VisDrone2019 and SODA-D using the latest anchor-based YOLOv5 and anchor-free YOLOv8 object detection algorithms.
The experimental results show that YOLOv5s (+3.12% Recall, +2.73% mAP@0.5, and +1.91% mAP@0.5:0.95) and YOLOv8s (+1.72% Recall and +0.60% mAP@0.5), both incorporating the CFIoU loss, achieved the largest performance improvement on the VisDrone2019 test set. Likewise, YOLOv5s (+6% Recall, +13.08% mAP@0.5, and +14.29% mAP@0.5:0.95) and YOLOv8s (+3.36% Recall, +3.66% mAP@0.5, and +4.05% mAP@0.5:0.95), both incorporating the CFIoU loss, achieved the largest performance improvement on the SODA-D test set. These results indicate the effectiveness and superiority of the CFIoU loss in small object detection. Additionally, we conducted comparative experiments fusing the CFIoU loss and the BIoU losses with the SSD algorithm, which is not proficient at small object detection. SSD with the CFIoU loss achieved the largest gains in AP (+5.59%) and AP75 (+5.37%), indicating that the CFIoU loss can also improve the performance of algorithms that are weak at small object detection.
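The abstract's core idea, replacing the normalized center-point penalty of DIoU-style losses with a normalized corner-point distance, can be sketched as follows. This is a minimal illustrative implementation under stated assumptions, not the paper's exact CFIoU formulation: the adaptive foreground-area term is omitted, and the function names and the choice of normalizing by the enclosing-box diagonal are assumptions made for the sketch.

```python
def iou(b1, b2):
    # Boxes as (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    ix1, iy1 = max(b1[0], b2[0]), max(b1[1], b2[1])
    ix2, iy2 = min(b1[2], b2[2]), min(b1[3], b2[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    a1 = (b1[2] - b1[0]) * (b1[3] - b1[1])
    a2 = (b2[2] - b2[0]) * (b2[3] - b2[1])
    return inter / (a1 + a2 - inter)

def corner_iou_loss(pred, target):
    # Diagonal (squared) of the smallest enclosing box, used to
    # normalize the corner distances so the penalty is scale-invariant.
    ex1, ey1 = min(pred[0], target[0]), min(pred[1], target[1])
    ex2, ey2 = max(pred[2], target[2]), max(pred[3], target[3])
    diag2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    # Squared distances between corresponding corners (top-left and
    # bottom-right). Unlike a center-point penalty, these stay nonzero
    # whenever the two boxes differ in size, even with aligned centers.
    tl = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    br = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    penalty = (tl + br) / (2.0 * diag2) if diag2 > 0 else 0.0
    return 1.0 - iou(pred, target) + penalty
```

Note that for two concentric boxes of different sizes, the center-point distance is zero (so a DIoU-style penalty vanishes), while the corner distances above remain nonzero and keep supplying a gradient, which is the degradation issue the abstract describes.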
Pages: 17
Related Papers (24 total)
  • [1] Bae, S.H. AAAI Conference on Artificial Intelligence, 2019, p. 8094.
  • [2] Chen, K.; Li, J.; Lin, W.; et al. Towards Accurate One-Stage Object Detection with AP-Loss. IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 5114-5122.
  • [3] Cheng, G.; et al. arXiv:2207.14096, 2023.
  • [4] Du, D.; Zhu, P.; Wen, L.; et al. VisDrone-DET2019: The Vision Meets Drone Object Detection in Image Challenge Results. IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), 2019, pp. 213-226.
  • [5] Girshick, R. Fast R-CNN. IEEE International Conference on Computer Vision (ICCV), 2015, pp. 1440-1448.
  • [6] Ultralytics YOLOv8. GitHub repository.
  • [7] He, K.; et al. IEEE International Conference on Computer Vision (ICCV), 2017, p. 2980. DOI: 10.1109/ICCV.2017.322.
  • [8] Law, H.; Deng, J. CornerNet: Detecting Objects as Paired Keypoints. Computer Vision - ECCV 2018, Part XIV, LNCS 11218, pp. 765-781.
  • [9] Lin, T.-Y.; Goyal, P.; Girshick, R.; He, K.; Dollár, P. Focal Loss for Dense Object Detection. IEEE International Conference on Computer Vision (ICCV), 2017, pp. 2999-3007.
  • [10] Liu, W.; Anguelov, D.; Erhan, D.; et al. SSD: Single Shot MultiBox Detector. Computer Vision - ECCV 2016, Part I, LNCS 9905, pp. 21-37.