Investigating the optimisation of real-world and synthetic object detection training datasets through the consideration of environmental and simulation factors

Cited by: 1
Authors
Newman, Callum [1 ]
Petzing, Jon [1 ]
Goh, Yee Mey [1 ]
Justham, Laura [1 ]
Affiliations
[1] Loughborough Univ, Wolfson Sch Mech Elect & Mfg Engn, Loughborough, England
Source
INTELLIGENT SYSTEMS WITH APPLICATIONS | 2022 / Vol. 14
Funding
UK Engineering and Physical Sciences Research Council (EPSRC);
Keywords
Computer vision; Object detection; Machine learning; Autonomous detection; Synthetic data; Optimisation;
DOI
10.1016/j.iswa.2022.200079
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Computer vision is used in many industrial applications involving automation, especially those related to efficiency and safety. Computer vision techniques that use machine learning, such as object detectors, need a dataset of images for training and testing. Publicly available datasets or newly created datasets can be used; however, these datasets are rarely examined for whether they lead to optimal performance. Environmental factors, such as lighting and occlusion, alter the appearance of the images, so images taken under certain conditions may have different effects on training. A knowledge gap has formed as to how the test performance of deep neural networks can be improved by considering the effect and interactions of such factors when either real or synthetic images are used. The following research illustrates that these factors can have a significant impact on test performance, demonstrates a process that can be applied to real-world and synthetic images to identify the effect of each factor, and discusses how this information may be used to create an optimal training dataset. (c) 2022 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Pages: 15