Gaussian Combined Distance: A Generic Metric for Object Detection

Cited by: 0
Authors
Guan, Ziqian [1]
Fu, Xieyi [1]
Huang, Pengjun [1]
Zhang, Hengyuan [1]
Du, Hubin [1]
Liu, Yongtao [1]
Wang, Yinglin [2]
Ma, Qang [2]
Affiliations
[1] North China Inst Sci & Technol, Key Lab Special Robots Safety Prod & Emergency Dis, Langfang 065201, Peoples R China
[2] Hegang Ind Technol Serv Co Ltd, Langfang 065008, Peoples R China
Keywords
Measurement; Object detection; Feature extraction; Optimization; Detectors; Geoscience and remote sensing; Accuracy; Training; Sensitivity; Convergence; Generic metric; Tiny object detection
DOI
10.1109/LGRS.2025.3531970
Chinese Library Classification
P3 [Geophysics]; P59 [Geochemistry]
Discipline classification code
0708; 070902
Abstract
In object detection, a well-defined similarity metric can significantly enhance model performance. Currently, the intersection over union (IoU)-based similarity metric is the most commonly preferred choice for detectors. However, detectors using IoU as a similarity metric often perform poorly when detecting small objects because of IoU's sensitivity to minor positional deviations. To address this issue, recent studies have proposed the Wasserstein distance (WD) as an alternative to IoU for measuring the similarity of Gaussian-distributed bounding boxes. However, we have observed that the WD lacks scale invariance, which negatively impacts the model's generalization capability. In addition, when the WD is used as a loss function, its independent optimization of the center attributes leads to slow model convergence and unsatisfactory detection precision. To address these challenges, we introduce the Gaussian Combined Distance (GCD). Through analytical examination of GCD and its gradient, we demonstrate that GCD not only possesses scale invariance but also facilitates joint optimization, which enhances model localization performance. Extensive experiments on the AI-TOD-v2 dataset for tiny object detection show that GCD, as a bounding box regression loss function and label assignment metric, achieves state-of-the-art (SOTA) performance across various detectors. We further validated the generalizability of GCD on the MS-COCO-2017 and VisDrone-2019 datasets, where it outperforms the WD across diverse scales of datasets. The code is available at: https://github.com/MArKkwanGuan/mmdet-GCD.
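The two weaknesses named in the abstract can be illustrated with a small sketch. It is not taken from the paper or its repository. It assumes the convention common in WD-based detection work of modeling a box (cx, cy, w, h) as a 2-D Gaussian with mean (cx, cy) and covariance diag(w^2/4, h^2/4), for which the squared 2-Wasserstein distance has the closed form used below. The helper names `iou`, `wasserstein2_sq`, and `scale` are hypothetical. The GCD formula itself is not given in this record, so it is not implemented here.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (cx, cy, w, h)."""
    ax1, ay1, ax2, ay2 = a[0] - a[2] / 2, a[1] - a[3] / 2, a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1, bx2, by2 = b[0] - b[2] / 2, b[1] - b[3] / 2, b[0] + b[2] / 2, b[1] + b[3] / 2
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))  # overlap width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))  # overlap height
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union


def wasserstein2_sq(a, b):
    """Squared 2-Wasserstein distance between Gaussian-modeled boxes.

    Assumption: box (cx, cy, w, h) -> N((cx, cy), diag(w^2/4, h^2/4)).
    For such axis-aligned Gaussians the distance reduces to a squared
    Euclidean distance between (cx, cy, w/2, h/2) vectors.
    """
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
            + ((a[2] - b[2]) / 2) ** 2 + ((a[3] - b[3]) / 2) ** 2)


def scale(box, s):
    """Scale every box attribute by s (a uniform image rescale)."""
    return tuple(s * v for v in box)


tiny_gt, tiny_pred = (10, 10, 8, 8), (12, 10, 8, 8)       # 8x8 box, 2 px offset
large_gt, large_pred = (100, 100, 64, 64), (102, 100, 64, 64)  # 64x64 box, same offset

# Same 2 px offset: IoU drops far more for the tiny box than for the large one.
print(iou(tiny_gt, tiny_pred), iou(large_gt, large_pred))

# Rescaling both boxes by 4 leaves IoU unchanged but multiplies WD^2 by 16.
print(iou(scale(tiny_gt, 4), scale(tiny_pred, 4)))
print(wasserstein2_sq(tiny_gt, tiny_pred),
      wasserstein2_sq(scale(tiny_gt, 4), scale(tiny_pred, 4)))
```

Under these assumptions, the squared WD grows quadratically with the rescaling factor while IoU stays fixed, which is one way to read the abstract's statement that the WD lacks scale invariance.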
Pages: 5
Related papers
50 in total
  • [21] Scene Adaptive SAR Incremental Target Detection via Context-Aware Attention and Gaussian-Box Similarity Metric
    Tian, Yu
    Zhou, Zheng
    Cui, Zongyong
    Cao, Zongjie
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
  • [22] Siamese-DETR for Generic Multi-Object Tracking
    Liu, Qiankun
    Li, Yichen
    Jiang, Yuqi
    Fu, Ying
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 3935 - 3949
  • [23] Cross-Modality Object Detection Based on DETR
    Huang, Xinyi
    Ma, Guochun
    IEEE ACCESS, 2025, 13 : 51220 - 51230
  • [24] Video Object Detection Guided by Object Blur Evaluation
    Wu, Yujie
    Zhang, Hong
    Li, Yawei
    Yang, Yifan
    Yuan, Ding
    IEEE ACCESS, 2020, 8 : 208554 - 208565
  • [25] CBASH: Combined Backbone and Advanced Selection Heads With Object Semantic Proposals for Weakly Supervised Object Detection
    Xia, Ruiyang
    Li, Guoquan
    Huang, Zhengwen
    Meng, Hongying
    Pang, Yu
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (10) : 6502 - 6514
  • [26] Dual Appearance-Aware Enhancement for Oriented Object Detection
    Gong, Maoguo
    Zhao, Hongyu
    Wu, Yue
    Tang, Zedong
    Feng, Kai-Yuan
    Sheng, Kai
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 14
  • [27] Explicit Margin Equilibrium for Few-Shot Object Detection
    Liu, Chang
    Li, Bohao
    Shi, Mengnan
    Chen, Xiaozhong
    Ye, Qixiang
    Ji, Xiangyang
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024,
  • [28] CrossDet++: Growing Crossline Representation for Object Detection
    Qiu, Heqian
    Li, Hongliang
    Wu, Qingbo
    Cui, Jianhua
    Song, Zichen
    Wang, Lanxiao
    Zhang, Minjian
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (03) : 1093 - 1108
  • [29] Temporal Speciation Network for Few-Shot Object Detection
    Zhao, Xiaowei
    Liu, Xianglong
    Ma, Yuqing
    Bai, Shihao
    Shen, Yifan
    Hao, Zeyu
    Liu, Aishan
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8267 - 8278
  • [30] Speaker Verification by Partial AUC Optimization With Mahalanobis Distance Metric Learning
    Bai, Zhongxin
    Zhang, Xiao-Lei
    Chen, Jingdong
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2020, 28 : 1533 - 1548