Best Practices in Active Learning for Semantic Segmentation

Cited by: 1

Authors:

Mittal, Sudhanshu [1]

Niemeijer, Joshua [2]

Schaefer, Joerg P. [2]

Brox, Thomas [1]

Affiliations:

[1] Univ Freiburg, Freiburg, Germany

[2] German Aerosp Ctr DLR, Braunschweig, Germany

Source:

PATTERN RECOGNITION, DAGM GCPR 2023 | 2024 / Vol. 14264

Keywords:

Active Learning; Semantic Segmentation;

DOI:

10.1007/978-3-031-54605-1_28

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Active learning is particularly of interest for semantic segmentation, where annotations are costly. Previous academic studies focused on datasets that are already very diverse and where the model is trained in a supervised manner with a large annotation budget. In contrast, data collected in many driving scenarios is highly redundant, and most medical applications are subject to very constrained annotation budgets. This work investigates the various types of existing active learning methods for semantic segmentation under diverse conditions across three dimensions - data distribution w.r.t. different redundancy levels, integration of semi-supervised learning, and different labeling budgets. We find that these three underlying factors are decisive for the selection of the best active learning approach. As an outcome of our study, we provide a comprehensive usage guide to obtain the best performance for each case. It is the first systematic study that investigates these dimensions covering a wide range of settings including more than 3K model training runs. In this work, we also propose an exemplary evaluation task for driving scenarios, where data has high redundancy, to showcase the practical implications of our research findings.

Pages: 427-442

Page count: 16
