Efficient vision-based multi-target augmented reality in the browser

Cited by: 0
Authors
Mohammed A. Al-Zoube
Affiliation
[1] Princess Sumaya University for Technology (PSUT), Department of Computer Graphics
Source
Multimedia Tools and Applications | 2022 / Volume 81
Keywords
Augmented reality; Web AR; Pose estimation; MobileNets; WebAssembly; Deep learning; Cross-platform
DOI
Not available
Abstract
Augmented Reality (AR) has gained increasing attention from both industry and academia as it enhances the way we interact with the physical world. Compared with native AR apps, implementing AR with web technologies (Web AR) provides lightweight, universal cross-platform deployment that requires no prior downloading or installation. However, developing Web AR apps poses challenges such as computational efficiency and networking, and the limited capabilities of the browser, especially on mobile devices, make it harder to build efficient web apps. Fortunately, several technical advances have emerged that could change the status of Web AR. This paper presents an efficient implementation of a vision-based, multi-target Web AR app that runs at real-time frame rates in standard web browsers on mobile devices and PCs. A method based on natural feature tracking (NFT) is used, and several new web technologies are leveraged for specific tasks. The proposed implementation takes advantage of an efficient, lightweight class of convolutional neural networks (CNNs) to classify image targets, and it uses an image registration method that eliminates the database of feature-point descriptors usually required by natural feature tracking methods. Computation-intensive tasks, such as target extraction and pose estimation, are offloaded to separate threads, so the main thread, which handles HTML rendering, runs smoothly and is not blocked by them. To evaluate and validate the performance of the proposed architecture, a prototype app was developed. The findings demonstrate that the app can track multiple image targets at real-time frame rates with stable interaction.
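The abstract's key engineering point is that the heavy vision work (target classification with a MobileNet-style CNN and pose estimation) runs off the main thread. The TypeScript sketch below illustrates that pattern under stated assumptions only; it is not the paper's published code, and the worker file name ("pose-worker.js"), the message shape, and the updateOverlay() hook are hypothetical.

```typescript
// Minimal sketch of the threading pattern described in the abstract:
// camera frames are posted from the main thread to a Web Worker, which is
// assumed to run the CNN target classifier and the (e.g. WebAssembly) pose
// estimator, so the main thread that drives HTML rendering is never blocked.

const poseWorker = new Worker("pose-worker.js"); // hypothetical worker script

const video = document.querySelector<HTMLVideoElement>("#camera")!;
const canvas = document.createElement("canvas");
const ctx = canvas.getContext("2d", { willReadFrequently: true })!;

// Hypothetical rendering hook: in a real app this would, for example, set a
// three.js object's matrix from the 4x4 pose returned by the worker.
function updateOverlay(targetId: number, pose: number[]): void {
  console.log(`target ${targetId}`, pose);
}

// Results arrive asynchronously; the main thread only updates the overlay.
poseWorker.onmessage = (e: MessageEvent<{ targetId: number; pose: number[] }>) => {
  updateOverlay(e.data.targetId, e.data.pose);
};

function pumpFrame(): void {
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  ctx.drawImage(video, 0, 0);
  const frame = ctx.getImageData(0, 0, canvas.width, canvas.height);
  // Transfer the pixel buffer (zero copy) instead of structured-cloning it,
  // so posting a frame stays cheap on mobile browsers.
  poseWorker.postMessage(
    { width: frame.width, height: frame.height, pixels: frame.data.buffer },
    [frame.data.buffer]
  );
  requestAnimationFrame(pumpFrame);
}

video.addEventListener("playing", () => requestAnimationFrame(pumpFrame));
```

In this arrangement the worker would also be the natural place to instantiate the WebAssembly module mentioned in the keywords, since compiling and running it there keeps the rendering loop responsive.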
Pages: 14303-14320
Number of pages: 17