Video object segmentation for automatic image annotation of ethernet connectors with environment mapping and 3D projection

Times cited: 0
Authors
Danta, Marrone [1]
Dreyer, Pedro [1]
Bezerra, Daniel [1]
Reis, Gabriel [1]
Souza, Ricardo [2]
Lins, Silvia [2]
Kelner, Judith [1]
Sadok, Djamel [1]
Affiliations
[1] Univ Fed Pernambuco, Ctr Informat, Grp Pesquisa Redes & Telecomunicacao, Recife, PE, Brazil
[2] Ericsson Res, Indaiatuba, SP, Brazil
Keywords
RJ45; Automatic annotation; Object tracking; 3D projection; Video object segmentation
DOI
10.1007/s11042-022-13128-z
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The creation of a dataset is time-consuming and sometimes discourages researchers from pursuing their goals. To overcome this problem, we present and discuss two solutions adopted to automate this process. Both save valuable user time and resources and use video object segmentation with object tracking and 3D projection. In our scenario, we acquire images from a moving robotic arm and, for each approach, generate a distinct annotated dataset. We evaluated the precision of the annotations by comparing them against a manually annotated dataset. As a complementary test to assess the quality of the generated datasets and to show that our contribution generalizes, we tested them on detection and classification problems. In both tests, we rely on deep learning solutions based on Convolutional Neural Networks. For detection, we used YOLO and obtained, on the projection dataset, F1-score, accuracy, and mAP values of 0.846, 0.924, and 0.875, respectively. On the tracking dataset, we achieved an F1-score of 0.861, an accuracy of 0.932, and an mAP of 0.894. For classification, we adopted two metrics, accuracy and F1-score, and used the well-known networks VGG, DenseNet, MobileNet, Inception, and ResNet. The VGG architecture outperformed the others on both the projection and tracking datasets. On the projection dataset, it reached an accuracy and F1-score of 0.997 and 0.993, respectively. Similarly, on the tracking dataset, it achieved an accuracy of 0.991 and an F1-score of 0.981.
Pages: 39891-39913
Page count: 23
Related papers
23 in total
  • [1] Video object segmentation for automatic image annotation of ethernet connectors with environment mapping and 3D projection
    Marrone Danta
    Pedro Dreyer
    Daniel Bezerra
    Gabriel Reis
    Ricardo Souza
    Silvia Lins
    Judith Kelner
    Djamel Sadok
    Multimedia Tools and Applications, 2022, 81 : 39891 - 39913
  • [2] Video Object Segmentation with 3D Convolution Network
    Tang, Huiyun
    Tao, Pin
    Ma, Rui
    Shi, Yuanchun
    ICCCV 2019: PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON CONTROL AND COMPUTER VISION, 2019, : 28 - 32
  • [3] Semi-automatic video object segmentation using seeded region merging and bidirectional projection
    Liu, Z
    Yang, J
    Peng, NS
    PATTERN RECOGNITION LETTERS, 2005, 26 (05) : 653 - 662
  • [4] A Semi-Automatic 2D to Stereoscopic 3D Image and Video Conversion System in a Semi-Automated Segmentation Perspective
    Phan, Raymond
    Androutsos, Dimitrios
    STEREOSCOPIC DISPLAYS AND APPLICATIONS XXIV, 2013, 8648
  • [5] Automatic 3D object pose estimation in IR image sequences for forward motion applications
    Jäger, K
    Hebel, M
    Bers, K
    AUTOMATIC TARGET RECOGNITION XIV, 2004, 5426 : 37 - 45
  • [6] Integrated Object Segmentation and Tracking for 3D LIDAR Data
    Tuncer, Mehmet Ali Cagri
    Schulz, Dirk
    ICINCO: PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON INFORMATICS IN CONTROL, AUTOMATION AND ROBOTICS, VOL 2, 2016, : 344 - 351
  • [7] Spatio-Temporal Video Object Segmentation via Scale-Adaptive 3D Structure Tensor
    Hai-Yun Wang
    Kai-Kuang Ma
    EURASIP Journal on Advances in Signal Processing, 2004
  • [8] Video-object segmentation and 3D-trajectory estimation for monocular video sequences
    Xu, Feng
    Lam, Kin-Man
    Dai, Qionghai
    IMAGE AND VISION COMPUTING, 2011, 29 (2-3) : 190 - 205
  • [9] Spatio-temporal video object segmentation via scale-adaptive 3D structure tensor
    Wang, HY
    Ma, KK
    EURASIP JOURNAL ON APPLIED SIGNAL PROCESSING, 2004, 2004 (06) : 798 - 813
  • [10] Temporal stabilization of video object segmentation for 3D-TV applications
    Erdem, CE
    Ernst, F
    Redert, A
    Hendriks, E
    SIGNAL PROCESSING-IMAGE COMMUNICATION, 2005, 20 (02) : 151 - 167