Boxy Vehicle Detection in Large Images

被引:15
作者
Behrendt, Karsten [1 ]
机构
[1] Bosch Automated Driving, Stuttgart, Germany
来源
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW) | 2019年
关键词
D O I
10.1109/ICCVW.2019.00112
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Camera-based object detection and automated driving in general have greatly improved over the last few years. Parts of these improvements can be attributed to public datasets which allow researchers around the world to work with data that would often be too expensive to collect and annotate for individual teams. Current vehicle detection datasets and approaches often focus on axis-aligned bounding boxes or semantic segmentation. Axis-aligned bounding boxes often misrepresent vehicle sizes and may intrude into neighboring lanes. While pixel level segmentations are more accurate, they can be hard to process and leverage for trajectory planning systems. We therefore present the Boxy dataset for image-based vehicle detection. Boxy is one of the largest public vehicle detection datasets with 1.99 million annotated vehicles in 200,000 images, including sunny, rainy, and nighttime driving. If possible, vehicle annotations are split into their visible sides to give the impression of 3D boxes for a more accurate representation with little overhead. Five megapixel images with annotations down to a few pixels make this dataset especially challenging. With Boxy, we provide initial benchmark challenges for bounding box, polygon, and real-time detections. All benchmarks are open-source so that additional metrics and benchmarks may be added at https://boxy-dataset.com.
引用
收藏
页码:840 / 846
页数:7
相关论文
共 33 条
  • [1] [Anonymous], 2017, IEEE INT C COMP VIS
  • [2] Caraffi C, 2012, IEEE INT C INTELL TR, P975, DOI 10.1109/ITSC.2012.6338748
  • [3] Monocular 3D Object Detection for Autonomous Driving
    Chen, Xiaozhi
    Kundu, Kaustav
    Zhang, Ziyu
    Ma, Huimin
    Fidler, Sanja
    Urtasun, Raquel
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 2147 - 2156
  • [4] The Cityscapes Dataset for Semantic Urban Scene Understanding
    Cordts, Marius
    Omran, Mohamed
    Ramos, Sebastian
    Rehfeld, Timo
    Enzweiler, Markus
    Benenson, Rodrigo
    Franke, Uwe
    Roth, Stefan
    Schiele, Bernt
    [J]. 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 3213 - 3223
  • [5] The Pascal Visual Object Classes (VOC) Challenge
    Everingham, Mark
    Van Gool, Luc
    Williams, Christopher K. I.
    Winn, John
    Zisserman, Andrew
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2010, 88 (02) : 303 - 338
  • [6] Vision meets robotics: The KITTI dataset
    Geiger, A.
    Lenz, P.
    Stiller, C.
    Urtasun, R.
    [J]. INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2013, 32 (11) : 1231 - 1237
  • [7] Girshick R., 2014, IEEE COMP SOC C COMP, DOI [10.1109/CVPR.2014.81, DOI 10.1109/CVPR.2014.81]
  • [8] Fast R-CNN
    Girshick, Ross
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 1440 - 1448
  • [9] He K., 2016, CVPR, DOI [10.1109/CVPR.2016.90, DOI 10.1109/CVPR.2016.90]
  • [10] Huang J., 2017, CVPR, DOI DOI 10.1109/CVPR.2017.351