Efficient vision-based multi-target augmented reality in the browser

Cited by: 7
Authors
Al-Zoube, Mohammed A. [1]
Affiliations
[1] Princess Sumaya Univ Technol PSUT, Dept Comp Graph, POB 1438, Amman 11941, Jordan
Keywords
Augmented reality; Web AR; Pose estimation; MobileNets; WebAssembly; Deep learning; Cross-platform
DOI
10.1007/s11042-022-12206-6
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Augmented Reality (AR) has gained rising attention from both industry and academia as it enhances the way we interact with the physical world. Compared with native AR apps, implementing AR with web technologies (Web AR) can provide lightweight, universal cross-platform deployment that requires no download or installation in advance. However, developing Web AR apps poses challenges, such as computational efficiency and networking. The limited capabilities of the browser, especially on mobile devices, make it harder to develop efficient web apps. Fortunately, several technical advances have emerged that could change the status of Web AR. This paper presents an efficient implementation of a vision-based, multi-target Web AR app that runs at real-time frame rates in standard web browsers on mobile devices and PCs. A method based on natural feature tracking (NFT) is used, and several new web technologies are optimized to achieve specific tasks. The proposed implementation takes advantage of an efficient and lightweight class of convolutional neural networks (CNNs) to classify image targets. It uses an image registration method that eliminates the need for a database of feature-point descriptors, which natural feature tracking methods usually require. Computation-intensive tasks, such as target extraction and pose estimation, run in separate threads. Thus, the main thread, which handles the HTML rendering, runs smoothly and is not blocked by these tasks. To evaluate and validate the proposed architecture, a prototype app was developed. The findings demonstrate that the app can track multiple image targets at real-time frame rates with stable interaction.
Pages: 14303-14320
Number of pages: 18