Video object segmentation for automatic image annotation of ethernet connectors with environment mapping and 3D projection

Cited by: 0

Authors:

Danta, Marrone [1]

Dreyer, Pedro [1]

Bezerra, Daniel [1]

Reis, Gabriel [1]

Souza, Ricardo [2]

Lins, Silvia [2]

Kelner, Judith [1]

Sadok, Djamel [1]

Affiliations:

[1] Univ Fed Pernambuco, Ctr Informat, Grp Pesquisa Redes & Telecomunicacao, Recife, PE, Brazil

[2] Ericsson Res, Indaiatuba, SP, Brazil

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2022 / Volume 81 / Issue 28

Keywords:

RJ45; Automatic annotation; Object tracking; 3D projection; Video object segmentation;

DOI:

10.1007/s11042-022-13128-z

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

The creation of a dataset is time-consuming and sometimes discourages researchers from pursuing their goals. To overcome this problem, we present and discuss two solutions adopted for the automation of this process. Both save valuable user time and resources and use video object segmentation with object tracking and 3D projection. In our scenario, we acquire images from a moving robotic arm and, for each approach, generate a distinct annotated dataset. We evaluated the precision of the annotations by comparing them with a manually annotated dataset. As a complementary test to assess the quality of the generated datasets and to generalize our contribution, we tested detection and classification problems. In both tests, we rely on solutions based on convolutional neural networks and deep learning. For detection, we used YOLO and obtained, for the projection dataset, F1-score, accuracy, and mAP values of 0.846, 0.924, and 0.875, respectively. For the tracking dataset, we achieved an F1-score of 0.861 and an accuracy of 0.932, whereas mAP reached 0.894. For classification, we adopted two metrics, accuracy and F1-score, and used the well-known networks VGG, DenseNet, MobileNet, Inception, and ResNet. The VGG architecture outperformed the others on both the projection and tracking datasets. For the projection dataset, it reached an accuracy and F1-score of 0.997 and 0.993, respectively. Similarly, for the tracking dataset, it achieved an accuracy of 0.991 and an F1-score of 0.981.
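
As an illustration of the metrics named in the abstract, the following minimal Python sketch shows one common way to compare an automatically generated bounding box with a manual one (intersection over union) and to compute an F1-score from detection counts. The abstract does not state which overlap measure or box format the authors used, so the `(x_min, y_min, x_max, y_max)` layout and the helper names `iou` and `f1_score` are assumptions made here for illustration only.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as
    (x_min, y_min, x_max, y_max). The box layout is an assumption."""
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_score(tp, fp, fn):
    """F1-score (harmonic mean of precision and recall) from
    true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical boxes: an automatic annotation versus a manual one.
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
    # Hypothetical detection counts.
    print(f1_score(tp=8, fp=2, fn=2))
```

In a setup like this, an automatic annotation would typically count as a true positive when its overlap with the manual box exceeds a chosen threshold; that threshold is a design choice and is not specified in the abstract.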

Citation

Pages: 39891-39913

Page count: 23

23 items in total

[1] Video object segmentation for automatic image annotation of ethernet connectors with environment mapping and 3D projection
Marrone Danta
Pedro Dreyer
Daniel Bezerra
Gabriel Reis
Ricardo Souza
Silvia Lins
Judith Kelner
Djamel Sadok
Multimedia Tools and Applications, 2022, 81 : 39891 - 39913
[2] Video Object Segmentation with 3D Convolution Network
Tang, Huiyun
Tao, Pin
Ma, Rui
Shi, Yuanchun
ICCCV 2019: PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON CONTROL AND COMPUTER VISION, 2019, : 28 - 32
[3] Semi-automatic video object segmentation using seeded region merging and bidirectional projection
Liu, Z
Yang, J
Peng, NS
PATTERN RECOGNITION LETTERS, 2005, 26 (05) : 653 - 662
[4] A Semi-Automatic 2D to Stereoscopic 3D Image and Video Conversion System in a Semi-Automated Segmentation Perspective
Phan, Raymond
Androutsos, Dimitrios
STEREOSCOPIC DISPLAYS AND APPLICATIONS XXIV, 2013, 8648
[5] Automatic 3D object pose estimation in IR image sequences for forward motion applications
Jäger, K
Hebel, M
Bers, K
AUTOMATIC TARGET RECOGNITION XIV, 2004, 5426 : 37 - 45
[6] Integrated Object Segmentation and Tracking for 3D LIDAR Data
Tuncer, Mehmet Ali Cagri
Schulz, Dirk
ICINCO: PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON INFORMATICS IN CONTROL, AUTOMATION AND ROBOTICS, VOL 2, 2016, : 344 - 351
[7] Spatio-Temporal Video Object Segmentation via Scale-Adaptive 3D Structure Tensor
Hai-Yun Wang
Kai-Kuang Ma
EURASIP Journal on Advances in Signal Processing, 2004
[8] Video-object segmentation and 3D-trajectory estimation for monocular video sequences
Xu, Feng
Lam, Kin-Man
Dai, Qionghai
IMAGE AND VISION COMPUTING, 2011, 29 (2-3) : 190 - 205
[9] Spatio-temporal video object segmentation via scale-adaptive 3D structure tensor
Wang, HY
Ma, KK
EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2004, 2004 (06) : 798 - 813
[10] Temporal stabilization of video object segmentation for 3D-TV applications
Erdem, CE
Ernst, F
Redert, A
Hendriks, E
SIGNAL PROCESSING-IMAGE COMMUNICATION, 2005, 20 (02) : 151 - 167
