Real-Time Deep Learning-Based Object Recognition in Augmented Reality

Cited by: 0
Authors
Egipko, V. [1]
Zhdanova, M. [1, 2]
Gapon, N. [1, 2]
Voronin, V. [1]
Semenishchev, E. [1]
Affiliations
[1] Moscow State Univ Technol STANKIN, Ctr Cognit Technol & Machine Vis, Moscow, Russia
[2] Don State Tech Univ, Rostov Na Donu, Russia
Source
REAL-TIME PROCESSING OF IMAGE, DEPTH, AND VIDEO INFORMATION 2024 | 2024 / Vol. 13000
Funding
Russian Science Foundation
Keywords
deep learning; augmented reality; computer vision; object recognition; robotic systems; real-time processing;
DOI
10.1117/12.3024957
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Augmented reality is a visualization technology that displays information by adding virtual images to the real world. Effective implementation of augmented reality requires recognition of the current scene. Identifying objects in real-time video on computationally limited hardware requires significant effort. One way to solve this problem is to create a hybrid system that, based on machine learning and computer vision technology, processes and analyzes visual data to identify and classify real-world objects. The proposed architecture is based on a combination of the Vuforia augmented system, which provides good performance by balancing prediction accuracy and efficiency. First, the Vuforia neural network architecture allows convenient interaction with AR in Unity and provides initial conditions for detecting 3D objects. The augmented reality construction algorithm is based on the ARCore framework and the OpenGL interface for embedded systems. The system integrates recognition data with an AR platform to display corresponding 3D models, allowing users to interact with them through the functionality of the AR application. This method also involves the development of an enhanced user interface for AR, making the augmented environment more accessible for navigation and control. Experimental research has shown that the proposed method significantly improves the accuracy of object recognition and the ease of working with 3D models in AR.
Pages: 7