The inconvenient truth of ground truth errors in automotive datasets and DNN-based detection

Authors
Chan, Pak Hung [1]
Li, Boda [1]
Baris, Gabriele [1]
Sadiq, Qasim [1]
Donzella, Valentina [1]
Affiliations
[1] Univ Warwick, WMG, Coventry, England
Source
DATA-CENTRIC ENGINEERING | 2024 / Vol. 5
Funding
Innovate UK project;
Keywords
machine learning; automated vehicles; automotive dataset; labeling; ANNOTATION;
DOI
10.1017/dce.2024.39
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Assisted and automated driving functions will rely on machine learning algorithms, given their ability to cope with real-world variations, e.g., vehicles of different shapes, positions, colors, and so forth. Supervised learning needs annotated datasets, and several automotive datasets are available. However, these datasets are tremendous in volume, and labeling accuracy and quality can vary across different datasets and within dataset frames. Accurate and appropriate ground truth is especially important for automotive applications, as "incomplete" or "incorrect" learning can negatively impact vehicle safety when these neural networks are deployed. This work investigates the ground truth quality of widely adopted automotive datasets, including a detailed analysis of KITTI MoSeg. Based on the errors identified and classified in the annotations of different automotive datasets, this article provides three different criteria collections for producing improved annotations. These criteria are enforceable and applicable to a wide variety of datasets. The three annotation sets are created to (i) remove dubious cases; (ii) annotate to the best of the human visual system; and (iii) remove clearly erroneous bounding boxes (BBs). KITTI MoSeg has been reannotated three times according to the specified criteria, and three state-of-the-art deep neural network object detectors are used to evaluate them. The results clearly show that network performance is affected by ground truth variations, and that removing clear errors is beneficial for predicting real-world objects only for some networks. The relabeled datasets still present some cases with "arbitrary"/"controversial" annotations, and therefore, this work concludes with some guidelines related to dataset annotation, metadata/sublabels, and specific automotive use cases.
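To illustrate why ground truth variations can change measured detector performance, the following is a minimal sketch and not the authors' evaluation code. It assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max), a greedy one-to-one matching at an IoU threshold of 0.5, and made-up coordinates. The names `iou` and `match_counts` are hypothetical helpers introduced only for this example.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max)
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_counts(preds: List[Box], gts: List[Box], thr: float = 0.5):
    """Greedy one-to-one matching; returns (TP, FP, FN)."""
    unmatched = list(gts)
    tp = 0
    for p in preds:
        # Pick the still-unmatched ground truth box that overlaps most.
        best = max(unmatched, key=lambda g: iou(p, g), default=None)
        if best is not None and iou(p, best) >= thr:
            unmatched.remove(best)
            tp += 1
    return tp, len(preds) - tp, len(unmatched)


# Illustrative (made-up) detections for one frame.
preds = [(0, 0, 10, 10), (20, 20, 30, 30)]

# Hypothetical original labels: one correct box, one dubious tiny box,
# and a real object at (20, 20, 30, 30) left unannotated.
gt_original = [(0, 0, 10, 10), (50, 50, 52, 52)]

# Hypothetical relabeled set: dubious box removed, missing object added.
gt_relabeled = [(0, 0, 10, 10), (20, 20, 30, 30)]

print(match_counts(preds, gt_original))   # (1, 1, 1)
print(match_counts(preds, gt_relabeled))  # (2, 0, 0)
```

With identical predictions, the counts move from one true positive, one false positive, and one false negative to two true positives purely because the labels changed, which is the kind of sensitivity the abstract describes.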
Pages: 13
Related papers
19 in total
  • [1] Automotive DNN-Based Object Detection in the Presence of Lens Obstruction and Video Compression
    Baris, Gabriele
    Li, Boda
    Chan, Pak Hung
    Avizzano, Carlo Alberto
    Donzella, Valentina
    IEEE ACCESS, 2025, 13 : 36575 - 36589
  • [2] Study of DNN-Based Ragweed Detection from Drones
    Lechner, Martin
    Steindl, Lukas
    Jantsch, Axel
    EMBEDDED COMPUTER SYSTEMS: ARCHITECTURES, MODELING, AND SIMULATION, SAMOS 2022, 2022, 13511 : 187 - 199
  • [3] Parameterization of Sequence of MFCCs for DNN-based voice disorder detection
    Grzywalski, Tomasz
    Maciaszek, Adam
    Biniakowski, Adam
    Orwat, Jan
    Drgas, Szymon
    Piecuch, Mateusz
    Belluzzo, Riccardo
    Joachimiak, Krzysztof
    Niemiec, Dawid
    Ptaszynski, Jakub
    Szarzynski, Krzysztof
    2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 5247 - 5251
  • [4] Label Consistency-Based Ground Truth Inference for Crowdsourcing
    Li, Jiao
    Jiang, Liangxiao
    Zhang, Wenjun
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024,
  • [5] ViF-GTAD: A new automotive dataset with ground truth for ADAS/AD development, testing, and validation
    Haas, Sarah
    Solmaz, Selim
    Reckenzaun, Jakob
    Genser, Simon
    INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2023, 42 (08) : 614 - 630
  • [6] Urdu paraphrase detection: A novel DNN-based implementation using a semi-automatically generated corpus
    Iqbal, Hafiz Rizwan
    Maqsood, Rashad
    Raza, Agha Ali
    Ul Hassan, Saeed
    NATURAL LANGUAGE ENGINEERING, 2024, 30 (02) : 354 - 384
  • [7] A Hybrid Approach for Fog Retrieval Based on a Combination of Satellite and Ground Truth Data
    Egli, Sebastian
    Thies, Boris
    Bendix, Joerg
    REMOTE SENSING, 2018, 10 (04)
  • [8] A New Region-Based Minimal Path Selection Algorithm for Crack Detection and Ground Truth Labeling Exploiting Gabor Filters
    de Leon, Gonzalo
    Fiorentini, Nicholas
    Leandri, Pietro
    Losa, Massimo
    REMOTE SENSING, 2023, 15 (11)
  • [9] Fake news detection models using the largest social media ground-truth dataset (TruthSeeker)
    Khalil M.
    Azzeh M.
    International Journal of Speech Technology, 2024, 27 (02) : 389 - 404
  • [10] Bluetooth-Based Vehicle Counting: Bridging the Gap to Ground-Truth With Machine Learning
    Tayeb, Fatima
    Chihaoui, Hamadi
    Filali, Fethi
    IEEE ACCESS, 2023, 11 : 64600 - 64607