Best Practices in Active Learning for Semantic Segmentation

Cited by: 1
Authors
Mittal, Sudhanshu [1]
Niemeijer, Joshua [2]
Schaefer, Joerg P. [2]
Brox, Thomas [1]
Affiliations
[1] Univ Freiburg, Freiburg, Germany
[2] German Aerosp Ctr DLR, Braunschweig, Germany
Source
PATTERN RECOGNITION, DAGM GCPR 2023 | 2024, Vol. 14264
Keywords
Active Learning; Semantic Segmentation
DOI
10.1007/978-3-031-54605-1_28
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Active learning is of particular interest for semantic segmentation, where annotations are costly. Previous academic studies focused on datasets that are already very diverse and where the model is trained in a supervised manner with a large annotation budget. In contrast, data collected in many driving scenarios is highly redundant, and most medical applications are subject to very constrained annotation budgets. This work investigates the various types of existing active learning methods for semantic segmentation under diverse conditions across three dimensions: data distribution w.r.t. different redundancy levels, integration of semi-supervised learning, and different labeling budgets. We find that these three underlying factors are decisive for the selection of the best active learning approach. As an outcome of our study, we provide a comprehensive usage guide to obtain the best performance for each case. It is the first systematic study that investigates these dimensions, covering a wide range of settings and including more than 3K model training runs. In this work, we also propose an exemplary evaluation task for driving scenarios, where data has high redundancy, to showcase the practical implications of our research findings.
Pages: 427-442
Page count: 16