FourStr: When Multi-sensor Fusion Meets Semi-supervised Learning

Cited by: 1
Authors
Xie, Bangquan [1 ,2 ]
Yang, Liang [3 ]
Yang, Longming [2 ]
Wei, Ailin [4 ]
Weng, Xiaoxiong [1 ]
Li, Bing [2 ]
Affiliations
[1] South China Univ Technol, Sch Civil Engn & Transportat, Guangzhou 510641, Peoples R China
[2] Clemson Univ, Int Ctr Automot Res CU ICAR, Dept Automot Engn, Greenville, SC 29607 USA
[3] CUNY City Coll, New York, NY 10031 USA
[4] Clemson Univ, Dept Bioengn, Clemson, SC 29631 USA
Source
2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA | 2023
Keywords
DOI
10.1109/ICRA48891.2023.10161363
Chinese Library Classification (CLC)
TP [automation technology; computer technology];
Discipline Classification Code
0812;
Abstract
This research proposes FourStr, a novel semi-supervised learning framework (a Four-Stream network formed by two two-stream models) that focuses on improving fusion and labeling efficiency for 3D multi-sensor detectors. FourStr adopts a multi-sensor single-stage detector named the adaptive fusion network (AFNet) as its backbone and trains it with the semi-supervised learning (SSL) strategy Stereo Fusion. Note that the multi-sensor AFNet and SSL Stereo Fusion benefit each other. On the one hand, the four-stream structure composed of two AFNets naturally provides the rich inputs and large model capacity needed by SSL Stereo Fusion, whereas other SSL works must rely on massive augmentation to obtain rich inputs and must deepen and widen the network to obtain large models. On the other hand, through its novel three fusion stages and Loss Pruning, Stereo Fusion improves the fusion and labeling efficiency of AFNet. Finally, extensive experiments demonstrate that FourStr performs excellently on outdoor datasets (KITTI and the Waymo Open Dataset) and an indoor dataset (SUN RGB-D), especially for small-contour objects. Compared with fully supervised methods, FourStr achieves similar accuracy with only 2% labeled data on KITTI (or 50% labeled data on SUN RGB-D).
Pages: 676-682
Number of pages: 7
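The abstract describes cross-teaching between two two-stream detectors under SSL, with a Loss Pruning step, but this record carries no implementation details. Below is a minimal sketch, assuming a simple mutual pseudo-labeling scheme in PyTorch: the TwoStreamStub model, the squared-error consistency loss, and the keep_ratio pruning rule are illustrative placeholders, not the published AFNet or Stereo Fusion method.

```python
import torch
import torch.nn as nn


class TwoStreamStub(nn.Module):
    """Stand-in for one AFNet-like branch fusing an image feature and a point feature."""

    def __init__(self, dim=32):
        super().__init__()
        self.img = nn.Linear(dim, dim)
        self.pts = nn.Linear(dim, dim)
        self.head = nn.Linear(dim, 1)  # toy prediction head

    def forward(self, img_feat, pts_feat):
        return self.head(self.img(img_feat) + self.pts(pts_feat))


def prune_losses(per_sample_loss, keep_ratio=0.8):
    """Hypothetical loss pruning: keep only the lowest-loss pseudo-labeled samples."""
    k = max(1, int(keep_ratio * per_sample_loss.numel()))
    kept, _ = torch.topk(per_sample_loss, k, largest=False)
    return kept.mean()


net_a, net_b = TwoStreamStub(), TwoStreamStub()
opt = torch.optim.Adam(list(net_a.parameters()) + list(net_b.parameters()), lr=1e-3)

img_u, pts_u = torch.randn(16, 32), torch.randn(16, 32)  # one unlabeled batch (random stand-in data)
for _ in range(10):
    pred_a = net_a(img_u, pts_u)
    pred_b = net_b(img_u, pts_u)
    # Each branch is supervised by the other's detached predictions (mutual pseudo-labels).
    loss_a = (pred_a - pred_b.detach()).pow(2).squeeze(1)
    loss_b = (pred_b - pred_a.detach()).pow(2).squeeze(1)
    loss = prune_losses(loss_a) + prune_losses(loss_b)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The pruning step only illustrates the idea of discarding the least reliable pseudo-label terms before back-propagation; the paper's actual criterion and fusion stages may differ.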