MonoTTA: Fully Test-Time Adaptation for Monocular 3D Object Detection

Cited by: 0
Authors
Lin, Hongbin [1, 2]
Zhang, Yifan [3, 4]
Niu, Shuaicheng [5]
Cui, Shuguang [1, 2]
Li, Zhen [1, 2]
Affiliations
[1] FNii Shenzhen, Shenzhen, Peoples R China
[2] CUHK Shenzhen, SSE, Shenzhen, Peoples R China
[3] NUS, Singapore, Singapore
[4] Skywork AI, Singapore, Singapore
[5] Nanyang Technol Univ, Singapore, Singapore
Source
COMPUTER VISION-ECCV 2024, PT XLIV | 2025 / Vol. 15102
Keywords
Test-time Adaptation; Monocular 3D Object Detection; Unsupervised Domain Adaptation
DOI
10.1007/978-3-031-72784-9_6
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Monocular 3D object detection (Mono 3Det) aims to identify 3D objects from a single RGB image. However, existing methods often assume that training and test data follow the same distribution, which may not hold in real-world test scenarios. To address this out-of-distribution (OOD) problem, we explore a new adaptation paradigm for Mono 3Det, termed Fully Test-time Adaptation, which adapts a well-trained model to unlabeled test data by handling potential data distribution shifts at test time. Applying this paradigm to Mono 3Det is challenging, however, because OOD test data cause a marked decline in object detection scores. This decline conflicts with the pre-defined score thresholds of existing detection methods, leading to severe object omissions (i.e., rare positive detections and many false negatives). Consequently, the scarcity of positive detections and the abundance of noisy predictions cause test-time adaptation to fail in Mono 3Det. To handle this problem, we propose a novel Monocular Test-Time Adaptation (MonoTTA) method based on two new strategies. 1) Reliability-driven adaptation: we empirically find that high-score objects remain reliable and that optimizing them can enhance confidence across all detections. We therefore devise a self-adaptive strategy to identify reliable objects for model adaptation, which discovers potential objects and alleviates omissions. 2) Noise-guard adaptation: since high-score objects may be scarce, we develop a negative regularization term that exploits the numerous low-score objects via negative learning, preventing overfitting to noise and trivial solutions. Experimental results show that MonoTTA brings significant performance gains for Mono 3Det models in OOD test scenarios: approximately 190% on average on KITTI and 198% on nuScenes. The source code is available at Hongbin98/MonoTTA.
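The two strategies named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the loss formulas or the self-adaptive selection rule, so the relative threshold (`ratio` times the highest score in the batch), the log-likelihood form of both terms, and every function and parameter name below are assumptions chosen only to show how a loss on high-score detections can be combined with a negative-learning term on low-score detections.

```python
import math
from typing import List, Tuple


def split_by_relative_threshold(
    scores: List[float], ratio: float = 0.5
) -> Tuple[List[float], List[float]]:
    """Split detection scores into high-score and low-score groups.

    Assumed rule (hypothetical): the threshold is `ratio` times the highest
    score in the batch, so it follows the scores down when they decline
    under distribution shift instead of staying fixed.
    """
    if not scores:
        return [], []
    threshold = ratio * max(scores)
    high = [s for s in scores if s >= threshold]
    low = [s for s in scores if s < threshold]
    return high, low


def adaptation_loss(
    scores: List[float],
    ratio: float = 0.5,
    neg_weight: float = 1.0,
    eps: float = 1e-7,
) -> float:
    """Illustrative test-time objective built from detection scores only.

    Positive term: -log(s) on high-score detections, pushing them higher.
    Negative term: -log(1 - s) on low-score detections, a simple form of
    negative learning that pushes them away from the positive class.
    """
    high, low = split_by_relative_threshold(scores, ratio)
    pos = sum(-math.log(max(s, eps)) for s in high) / len(high) if high else 0.0
    neg = sum(-math.log(max(1.0 - s, eps)) for s in low) / len(low) if low else 0.0
    return pos + neg_weight * neg


# Usage: two confident detections and two weak ones.
example_scores = [0.9, 0.8, 0.2, 0.1]
example_loss = adaptation_loss(example_scores)
```

In a real detector the scores would be differentiable model outputs and the loss would be minimized with a gradient step on each test batch; plain floats are used here to keep the sketch self-contained.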
Pages: 96-114
Number of pages: 19