Binocular Depth Measurement Method for Desktop Interaction Scene

Cited by: 0

Authors:

Ye, Bin [1,2]

Zhu, Xingshuai [1,2]

Yao, Kang [1,2]

Ding, Shangshang [1,2]

Fu, Weiwei [1,2]

Affiliations:

[1] Division of Life Sciences and Medicine, School of Biomedical Engineering (Suzhou), University of Science and Technology of China, Jiangsu, Suzhou

[2] Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, Jiangsu, Suzhou

Source:

Computer Engineering and Applications | 2024 / Vol. 60 / No. 09

Keywords:

binocular vision; deep learning; depth measurement; desktop interaction; stereo matching

DOI:

10.3778/j.issn.1002-8331.2212-0373


Abstract:

Vision-based virtual reality interaction methods offer no dedicated solution for desktop writing scenarios. Accurately recognizing fine interactive actions requires high-precision three-dimensional recognition of the hand and pen together, and depth accuracy is a key factor in the accuracy of that recognition. This paper therefore proposes a high-precision depth measurement method. Its core idea is to take high-resolution, close-range image pairs of the writing interaction as input and to cross-fuse globally and locally important information, which improves speed and accuracy while reducing computational cost. In the algorithm, a region detection module extracts the key areas of the hand and pen tip in the image pair, and the input is then scaled according to the importance of each region. A regional feature pyramid structure is introduced to extract multi-scale semantic information, and a disparity cascade module narrows the matching range to improve real-time performance. Experimental results confirm that the method achieves high accuracy and good real-time performance in the interaction area between the hand and pen tip, and that it can help improve three-dimensional recognition accuracy and thus the writing interaction experience. The study may offer new insight and a theoretical basis for future applications of depth measurement in writing interaction. © 2024 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved.
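The idea of narrowing the matching range with a disparity cascade can be illustrated with a minimal sketch. This is not the paper's network: it swaps in classical sum-of-absolute-differences (SAD) block matching on a single scanline in place of the learned matching, and every name here (`downsample`, `sad_cost`, `best_disparity`, `cascade_disparity`, `depth_from_disparity`) as well as the window size, search radius, focal length, and baseline are hypothetical choices made for illustration. A coarse disparity is found at half resolution over the full range, then refined at full resolution within a small window around the upscaled coarse estimate; depth follows from the standard rectified-stereo relation Z = f·B/d.

```python
def downsample(row):
    """Halve resolution by averaging adjacent pixel pairs."""
    return [(row[i] + row[i + 1]) / 2 for i in range(0, len(row) - 1, 2)]


def sad_cost(left, right, x, d, half):
    """SAD between a window centred at x in the left row and at x - d in the right row."""
    cost = 0.0
    for k in range(-half, half + 1):
        xl, xr = x + k, x + k - d
        if xl < 0 or xl >= len(left) or xr < 0 or xr >= len(right):
            return float("inf")  # window leaves the image: reject this candidate
        cost += abs(left[xl] - right[xr])
    return cost


def best_disparity(left, right, x, d_lo, d_hi, half):
    """Return the disparity in [max(0, d_lo), d_hi] with the lowest SAD cost."""
    candidates = range(max(0, d_lo), d_hi + 1)
    return min(candidates, key=lambda d: sad_cost(left, right, x, d, half))


def cascade_disparity(left, right, x, max_disp, half=2, radius=2):
    """Two-level cascade: full-range search at half resolution, narrow search at full resolution."""
    coarse = best_disparity(downsample(left), downsample(right),
                            x // 2, 0, max_disp // 2, half)
    center = 2 * coarse  # scale the coarse estimate back to full resolution
    return best_disparity(left, right, x,
                          center - radius, min(max_disp, center + radius), half)


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Rectified-stereo depth in metres: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Synthetic scanline pair: the left row is the right row shifted by 12 pixels.
    right = [(7 * i * i + 13 * i) % 101 for i in range(128)]
    left = [right[i - 12] if i >= 12 else 0 for i in range(128)]
    d = cascade_disparity(left, right, x=64, max_disp=32)
    # Assumed camera parameters, for illustration only.
    print(d, depth_from_disparity(d, focal_px=800.0, baseline_m=0.06))
```

With these assumed settings the full-resolution stage evaluates at most 2·radius + 1 = 5 candidates instead of all 33 in the range 0 to 32, at the price of a 17-candidate search on a row half as long. That trade is the reason a cascade can help real-time performance.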

Citation

Pages: 283-291

Page count: 8
