Automatic Segmentation using Knowledge Distillation with Ensemble Models (ASKDEM)

Cited by: 0

Authors:

Buschiazzo, Anthony [1]

Russell, Mason [2]

Osteen, Philip [2]

Uplinger, James [2]

Affiliations:

[1] Huntington Ingalls Ind, 8350 Broad St, Suite 1400, McLean, VA 22102 USA

[2] DEVCOM Army Res Lab, 2800 Powder Mill Rd, Adelphi, MD 20783 USA

Source:

UNMANNED SYSTEMS TECHNOLOGY XXVI | 2024 / Vol. 13055

Keywords:

Deep Learning; Computer Vision; Visual Perception; Scene Segmentation; Autonomous Learning; Continuous Learning; Knowledge Distillation; Ensembled Modeling;

DOI:

10.1117/12.3013678

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812;

Abstract:

Fielding deep learning artificial intelligence (AI) capabilities in environments that are not well represented within training datasets remains a challenge in deploying reliable perception algorithms. Environments with scarce or no representation within training datasets lead to poor semantic scene segmentation and subsequently result in suboptimal autonomous navigation performance in Unmanned Ground Vehicles (UGVs). This can be attributed to multiple technical variables, including intrinsic camera properties, lighting, weather, and seasonal differences, all of which limit a model's ability to generalize to diverse environments and hardware configurations. Recently, zero-shot generalization capabilities for scene segmentation have been demonstrated with pre-trained foundational models. Combining the capabilities of such models with state-of-the-art semantic segmentation models can result in semantic representations of scenes with less label noise and better object boundaries. If accurate semantic labels can be applied to unlabeled segments, the resulting pseudo-labeled semantic segmentation data could be used to re-train an existing semantic segmentation model for new environments. To achieve this goal, we develop an architecture based on ensembles of semantic segmentation models to improve inference results in new environments by strengthening the pixel label predictions used to classify unlabeled segmentation outputs. The process of automatically generating pseudo-labeled data can be computationally intensive and lacks the speed required for online inference on embodied systems. By utilizing the capabilities of pre-trained segmentation models in conjunction with an ensemble of semantic models, we can rapidly label data collected from a UGV in an environment that our fielded lightweight online model has never seen. Once the data is labeled, the original field model is retrained using the AI pseudo-labeled dataset and evaluated against the original field model. This work explores the possibility of a continuous learning framework that applies an ensemble of models to rapidly label data for model retraining. We present results showing that the approach can lead to improved algorithm performance, with practical effect on the capabilities of UGVs relying on AI models that were trained on data from domains outside of the current operating environment. We show that models trained using our approach improved overall mIoU by an average of 4.75% on two distinct datasets and provide qualitative results for a third dataset.
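The labeling idea in the abstract (an ensemble of semantic models assigning classes to class-agnostic segments, scored by mIoU) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the probability averaging, the per-segment mean vote, and the `min_conf` threshold are all assumptions made for illustration, and the abstract does not state how the ensemble outputs are actually fused.

```python
import numpy as np

def ensemble_probs(prob_maps):
    """Average per-pixel class probabilities from several models.

    prob_maps: list of arrays shaped (num_classes, H, W).
    Simple averaging is an assumed fusion rule, not the paper's.
    """
    return np.mean(np.stack(prob_maps, axis=0), axis=0)

def label_segments(mean_probs, segment_ids, ignore_label=255, min_conf=0.5):
    """Assign one class to each class-agnostic segment.

    segment_ids: (H, W) integer array, one id per segment, as might be
    produced by a pre-trained segmentation model (hypothetical input).
    Segments whose best mean probability is below min_conf stay unlabeled.
    """
    labels = np.full(segment_ids.shape, ignore_label, dtype=np.int64)
    for seg in np.unique(segment_ids):
        mask = segment_ids == seg
        seg_probs = mean_probs[:, mask].mean(axis=1)  # shape (num_classes,)
        best = int(np.argmax(seg_probs))
        if seg_probs[best] >= min_conf:
            labels[mask] = best
    return labels

def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean intersection-over-union over classes present in pred or target."""
    valid = target != ignore_label
    ious = []
    for c in range(num_classes):
        p = (pred == c) & valid
        t = (target == c) & valid
        union = np.logical_or(p, t).sum()
        if union > 0:
            ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")
```

In a pipeline like the one described, the resulting pseudo-label maps would serve as training targets when retraining the lightweight fielded model, with `ignore_label` pixels excluded from the loss.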


Pages: 16

References: 18 in total

[1] IDA: Informed Domain Adaptive Semantic Segmentation
Chen, Zheng
Ding, Zhengming
Gregory, Jason M.
Liu, Lantao
[J]. 2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, IROS, 2023, : 90 - 97
[2] Ensemble deep learning: A review
Ganaie, M. A.
Hu, Minghui
Malik, A. K.
Tanveer, M.
Suganthan, P. N.
[J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 115
[3] Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation
He, Jianzhong
Jia, Xu
Chen, Shuaijun
Liu, Jianzhuang
[J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 11003 - 11012
[4] RELLIS-3D Dataset: Data, Benchmarks and Analysis
Jiang, Peng
Osteen, Philip
Wigness, Maggie
Saripalli, Srikanth
[J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 1110 - 1116
[5] Segment Anything
Kirillov, Alexander
Mintun, Eric
Ravi, Nikhila
Mao, Hanzi
Rolland, Chloe
Gustafson, Laura
Xiao, Tete
Whitehead, Spencer
Berg, Alexander C.
Lo, Wan-Yen
Dollár, Piotr
Girshick, Ross
[J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 3992 - 4003
[6] Lee D.-H., 2013, ICML 2013 WORKSH CHA
[7] Li H., 2018, ARXIV180510180, V1805, P10180
[8] Li HC, 2018, Arxiv, DOI [arXiv:1805.10180, DOI 10.48550/ARXIV.1805.10180]
[9] Feature Pyramid Networks for Object Detection
Lin, Tsung-Yi
Dollar, Piotr
Girshick, Ross
He, Kaiming
Hariharan, Bharath
Belongie, Serge
[J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 936 - 944
[10] Multi-Source Soft Pseudo-Label Learning with Domain Similarity-based Weighting for Semantic Segmentation
Matsuzaki, Shigemichi
Masuzawa, Hiroaki
Miura, Jun
[J]. 2023 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2023, : 5852 - 5857
