Automatic Segmentation using Knowledge Distillation with Ensemble Models (ASKDEM)

Authors
Buschiazzo, Anthony [1]
Russell, Mason [2]
Osteen, Philip [2]
Uplinger, James [2]
Affiliations
[1] Huntington Ingalls Ind, 8350 Broad St, Suite 1400, McLean, VA 22102 USA
[2] DEVCOM Army Res Lab, 2800 Powder Mill Rd, Adelphi, MD 20783 USA
Source
UNMANNED SYSTEMS TECHNOLOGY XXVI | 2024 / Vol. 13055
Keywords
Deep Learning; Computer Vision; Visual Perception; Scene Segmentation; Autonomous Learning; Continuous Learning; Knowledge Distillation; Ensembled Modeling;
DOI
10.1117/12.3013678
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Fielding deep learning artificial intelligence (AI) capabilities in environments that are not well represented in training datasets remains a challenge for deploying reliable perception algorithms. Environments with scarce or no representation in training datasets lead to poor semantic scene segmentation and, in turn, to suboptimal autonomous navigation performance in Unmanned Ground Vehicles (UGVs). This can be attributed to multiple technical variables, including intrinsic camera properties, lighting, weather, and seasonal differences, all of which limit a model's ability to generalize to diverse environments and hardware configurations. Recently, zero-shot generalization capabilities for scene segmentation have been demonstrated with pre-trained foundation models. Combining the capabilities of such models with state-of-the-art semantic segmentation models can result in semantic representations of scenes with less label noise and better object boundaries. If accurate semantic labels can be applied to unlabeled segments, the resulting pseudo-labeled semantic segmentation data could be used to re-train an existing semantic segmentation model for new environments. To achieve this goal, we develop an architecture based on ensembles of semantic segmentation models to improve inference results in new environments by strengthening the pixel label predictions used to classify unlabeled segmentation outputs. The process of automatically generating pseudo-labeled data can be computationally intensive and lacks the speed required for online inference on embodied systems. By utilizing the capabilities of pre-trained segmentation models in conjunction with an ensemble of semantic models, we can rapidly label data collected from a UGV in an environment that our fielded lightweight online model has never seen.
Once the data is labeled, the original field model is retrained using the AI pseudo-labeled dataset and evaluated against the original field model. This work explores the possibility of a continuous learning framework that applies an ensemble of models to rapidly label data for model retraining. We present results showing that the approach can lead to improved algorithm performance with practical effect on the capabilities of UGVs relying on AI models that were trained on data from domains outside of the current operating environment. We show that models trained using our approach improved overall mIoU by an average of 4.75% on two distinct datasets and provide qualitative results for a third dataset.
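The abstract does not give implementation details, so the following is only a minimal sketch of the general idea it describes: averaging per-pixel class probabilities from an ensemble of semantic models, using that average to assign one class to each class-agnostic segment, and scoring a prediction with mIoU. The function names, the probability-averaging rule, the `min_conf` threshold, and the `ignore_label` value of 255 are all illustrative assumptions, not the authors' method.

```python
import numpy as np


def ensemble_probs(prob_maps):
    """Average per-pixel class probabilities over an ensemble.

    prob_maps: list of arrays shaped (num_classes, H, W), one per model.
    Plain averaging is an assumption; the paper may combine models differently.
    """
    return np.mean(np.stack(prob_maps, axis=0), axis=0)


def label_segments(mean_probs, segment_masks, min_conf=0.5, ignore_label=255):
    """Assign one semantic class to each class-agnostic segment.

    mean_probs: array (num_classes, H, W) of ensemble-averaged probabilities.
    segment_masks: list of boolean (H, W) masks, e.g. from a pre-trained
        segmentation model (hypothetical input; not produced here).
    Segments whose best mean probability is below min_conf stay unlabeled.
    """
    _, height, width = mean_probs.shape
    pseudo = np.full((height, width), ignore_label, dtype=np.uint8)
    for mask in segment_masks:
        if not mask.any():
            continue  # skip empty masks
        # Mean probability of each class over the pixels inside the segment.
        seg_probs = mean_probs[:, mask].mean(axis=1)
        best = int(seg_probs.argmax())
        if seg_probs[best] >= min_conf:
            pseudo[mask] = best
    return pseudo


def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean intersection-over-union over classes present in pred or target."""
    valid = target != ignore_label
    ious = []
    for cls in range(num_classes):
        p = (pred == cls) & valid
        t = (target == cls) & valid
        union = (p | t).sum()
        if union > 0:
            ious.append((p & t).sum() / union)
    return float(np.mean(ious)) if ious else float("nan")
```

Under these assumptions, pixels left at `ignore_label` would simply be excluded from the retraining loss, so low-confidence segments do not inject label noise into the pseudo-labeled dataset.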
Pages: 16