Supervised Virtual-to-Real Domain Adaptation for Object Detection Task using YOLO

Cited by: 0
Authors
Nugraha, Akbar Satya [1 ]
Yudistira, Novanto [1 ]
Rahayudi, Bayu [1 ]
Affiliations
[1] Brawijaya Univ, Fac Comp Sci, Dept Informat Engn, Malang, Indonesia
Source
2024 IEEE CONFERENCE ON ARTIFICIAL INTELLIGENCE, CAI 2024 | 2024
Keywords
YOLOv4; Object Detection; Virtual Dataset; Domain Adaptation; Personal Protective Equipment
DOI
10.1109/CAI59869.2024.00242
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Deep neural networks perform well on many real-world tasks, one of which is object detection. A well-annotated dataset strongly affects a deep neural network's accuracy: the more data the network learns from, the more accurate the model becomes. However, well-annotated datasets are hard to find, especially for specific domains. To overcome this, computer-generated data, or virtual datasets, can be used: researchers can generate many images for a specific use case, together with their annotations. Prior studies have shown that virtual datasets can be used for object detection tasks. Nevertheless, a model trained on a virtual dataset must still adapt to real data, i.e., it must have domain adaptation capability. We explore domain adaptation in an object detection model trained on a virtual dataset to compensate for the scarcity of well-annotated real data. We use the VW-PPE dataset, with 5000 and 10000 virtual images and 220 real images. For the model architecture, we use YOLOv4 with CSPDarknet53 as the backbone and PAN as the neck. The domain adaptation technique that fine-tunes only the backbone weights achieves a mean average precision of 74.457.
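Although the record does not include code, the backbone-only fine-tuning strategy described in the abstract can be illustrated with a minimal sketch. The sketch below assumes a PyTorch-style implementation; the YOLOv4 wrapper class, the backbone/neck/head module names, the optimizer settings, and loss_fn are hypothetical placeholders, not the authors' actual code.

from torch import nn, optim

class YOLOv4(nn.Module):
    """Placeholder wrapper: CSPDarknet53 backbone, PAN neck, YOLO detection head.
    The concrete submodules would come from whatever YOLOv4 implementation is used."""
    def __init__(self, backbone, neck, head):
        super().__init__()
        self.backbone, self.neck, self.head = backbone, neck, head

    def forward(self, x):
        return self.head(self.neck(self.backbone(x)))

def finetune_backbone_only(model, real_loader, loss_fn, epochs=10, lr=1e-4):
    # Freeze the neck and head so that only the backbone adapts to the real domain.
    for module in (model.neck, model.head):
        for p in module.parameters():
            p.requires_grad = False

    # Optimize only the still-trainable backbone parameters.
    optimizer = optim.SGD(model.backbone.parameters(), lr=lr, momentum=0.9)

    model.train()
    for _ in range(epochs):
        for images, targets in real_loader:  # small real dataset, e.g. 220 images
            optimizer.zero_grad()
            loss = loss_fn(model(images), targets)  # YOLO detection loss (placeholder)
            loss.backward()
            optimizer.step()
    return model

Under this reading of the abstract, the model would first be trained on the 5000 or 10000 virtual images and then passed through a routine like the one above with the 220 real images, so the CSPDarknet53 features shift toward real-image statistics while the PAN neck and detection head stay fixed.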
Pages: 1359-1364
Page count: 6