The inconvenient truth of ground truth errors in automotive datasets and DNN-based detection

Cited by: 0

Authors:

Chan, Pak Hung [1]

Li, Boda [1]

Baris, Gabriele [1]

Sadiq, Qasim [1]

Donzella, Valentina [1]

Affiliations:

[1] Univ Warwick, WMG, Coventry, England

Source:

DATA-CENTRIC ENGINEERING | 2024 / Vol. 5

Funding:

Innovate UK project;

Keywords:

machine learning; automated vehicles; automotive dataset; labeling; ANNOTATION;

DOI:

10.1017/dce.2024.39

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Assisted and automated driving functions will rely on machine learning algorithms, given their ability to cope with real-world variations, e.g. vehicles of different shapes, positions, colors, and so forth. Supervised learning needs annotated datasets, and several automotive datasets are available. However, these datasets are tremendous in volume, and labeling accuracy and quality can vary across different datasets and within dataset frames. Accurate and appropriate ground truth is especially important for automotive applications, as "incomplete" or "incorrect" learning can negatively impact vehicle safety when these neural networks are deployed. This work investigates the ground truth quality of widely adopted automotive datasets, including a detailed analysis of KITTI MoSeg. Based on the errors identified and classified in the annotations of different automotive datasets, this article provides three different sets of criteria for producing improved annotations. These criteria are enforceable and applicable to a wide variety of datasets. The three annotation sets are created to (i) remove dubious cases; (ii) annotate to the best of the human visual system; and (iii) remove clearly erroneous bounding boxes (BBs). KITTI MoSeg has been reannotated three times according to the specified criteria, and three state-of-the-art deep neural network object detectors are used to evaluate the resulting annotation sets. The results clearly show that network performance is affected by ground truth variations, and that removing clear errors is beneficial for predicting real-world objects only for some networks. The relabeled datasets still present some cases with "arbitrary"/"controversial" annotations, and therefore, this work concludes with some guidelines related to dataset annotation, metadata/sublabels, and specific automotive use cases.


Pages: 13
