GuideRender: large-scale scene navigation based on multi-modal view frustum movement prediction

Cited by: 26
Authors
Qin, Yiming [1 ,3 ]
Chi, Xiaoyu [2 ]
Sheng, Bin [1 ]
Lau, Rynson W. H. [3 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China
[2] Beihang Univ, Qingdao Res Inst, Qingdao, Peoples R China
[3] City Univ Hong Kong, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Distributed parallel rendering; Multi-modal; View frustum movement prediction; Attentional guidance fusion; GAZE PREDICTION; FRAMEWORK;
DOI
10.1007/s00371-023-02922-x
CLC Classification
TP31 [Computer Software]
Discipline Codes
081202; 0835
Abstract
Distributed parallel rendering provides a valuable way to navigate large-scale scenes. However, previous works typically focused on outputting ultra-high-resolution images. In this paper, we focus on improving the interactivity of navigation and propose a large-scale scene navigation method, GuideRender, based on multi-modal view frustum movement prediction. Given previous frames, user inputs and object information, GuideRender first extracts frame, user-input and object features spatially and temporally using a multi-modal extractor. To obtain effective fused features for prediction, we introduce an attentional guidance fusion module that fuses these features from different domains via attention. Finally, we predict the movement of the view frustum from the attentionally fused features and obtain its future state, so that scene data can be loaded in advance to reduce latency. In addition, to support GuideRender, we design an object hierarchy hybrid tree for scene management based on the object distribution and hierarchy, and an adaptive virtual sub-frustum decomposition method for task decomposition based on the relationship between rendering cost and rendering node capacity. Experimental results show that GuideRender outperforms baselines in navigating large-scale scenes. We also conduct a user study showing that our method satisfies the navigation requirements of large-scale scenes.
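The abstract describes a pipeline of per-modality feature extraction, attention-based fusion, and regression of the view frustum's movement so that scene data can be prefetched. Below is a minimal sketch of that idea, assuming a PyTorch setting; the module names (AttentionalGuidanceFusion, FrustumMovementPredictor), the feature dimensions, and the 6-DoF movement output are illustrative assumptions, not the authors' implementation.

# Minimal sketch (not the paper's code) of attention-guided fusion of frame,
# user-input and object features, followed by frustum-movement prediction.
import torch
import torch.nn as nn

class AttentionalGuidanceFusion(nn.Module):
    """Hypothetical fusion block: frame features cross-attend over the
    concatenated user-input and object features."""
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, frame_feat, input_feat, object_feat):
        # All inputs: (batch, sequence_length, dim)
        guidance = torch.cat([input_feat, object_feat], dim=1)
        fused, _ = self.attn(query=frame_feat, key=guidance, value=guidance)
        return self.norm(fused + frame_feat)  # residual connection

class FrustumMovementPredictor(nn.Module):
    """Regresses a 6-DoF frustum movement (translation + rotation) from the
    fused multi-modal features; the output size is an assumption."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.fusion = AttentionalGuidanceFusion(dim)
        self.head = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 6))

    def forward(self, frame_feat, input_feat, object_feat):
        fused = self.fusion(frame_feat, input_feat, object_feat)
        # Pool over the temporal axis, then predict the movement.
        return self.head(fused.mean(dim=1))

# Usage: the predicted movement advances the current frustum state so the
# rendering nodes can load the corresponding scene data ahead of time.
predictor = FrustumMovementPredictor()
frame_feat = torch.randn(1, 8, 256)    # features of previous frames
input_feat = torch.randn(1, 8, 256)    # features of user inputs
object_feat = torch.randn(1, 16, 256)  # features of scene objects
movement = predictor(frame_feat, input_feat, object_feat)  # shape (1, 6)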
Pages: 3597-3607
Page count: 11
相关论文
共 50 条
  • [1] GuideRender: large-scale scene navigation based on multi-modal view frustum movement prediction
    Yiming Qin
    Xiaoyu Chi
    Bin Sheng
    Rynson W. H. Lau
    The Visual Computer, 2023, 39 : 3597 - 3607
  • [2] Large-scale Multi-modal Search and QA at Alibaba
    Jin, Rong
    PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 8 - 8
  • [3] MMpedia: A Large-Scale Multi-modal Knowledge Graph
    Wu, Yinan
    Wu, Xiaowei
    Li, Junwen
    Zhang, Yue
    Wang, Haofen
    Du, Wen
    He, Zhidong
    Liu, Jingping
    Ruan, Tong
    SEMANTIC WEB, ISWC 2023, PT II, 2023, 14266 : 18 - 37
  • [4] Richpedia: A Large-Scale, Comprehensive Multi-Modal Knowledge Graph
    Wang, Meng
    Wang, Haofen
    Qi, Guilin
    Zheng, Qiushuo
    BIG DATA RESEARCH, 2020, 22 (22)
  • [5] DEPTH ESTIMATION OF MULTI-MODAL SCENE BASED ON MULTI-SCALE MODULATION
    Wang, Anjie
    Fang, Zhijun
    Jiang, Xiaoyan
    Gao, Yongbin
    Cao, Gaofeng
    Ma, Siwei
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 2795 - 2799
  • [6] Exploring Multi-Scenario Multi-Modal CTR Prediction with a Large Scale Dataset
    Huan, Zhaoxin
    Ding, Ke
    Li, Ang
    Zhang, Xiaolu
    Min, Xu
    He, Yong
    Zhang, Liang
    Zhou, Jun
    Mo, Linjian
    Gu, Jinjie
    Liu, Zhongyi
    Zhong, Wenliang
    Zhang, Guannan
    Li, Chenliang
    Yuan, Fajie
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 1232 - 1241
  • [7] REINFORCE: rapid augmentation of large-scale multi-modal transport networks for resilience enhancement
    Henry, Elise
    Furno, Angelo
    El Faouzi, Nour-Eddin
    APPLIED NETWORK SCIENCE, 2021, 6 (01)
  • [8] IBISCape: A Simulated Benchmark for multi-modal SLAM Systems Evaluation in Large-scale Dynamic Environments
    Soliman, Abanob
    Bonardi, Fabien
    Sidibe, Desire
    Bouchafa, Samia
    JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS, 2022, 106 (03)
  • [9] IBISCape: A Simulated Benchmark for multi-modal SLAM Systems Evaluation in Large-scale Dynamic Environments
    Abanob Soliman
    Fabien Bonardi
    Désiré Sidibé
    Samia Bouchafa
    Journal of Intelligent & Robotic Systems, 2022, 106
  • [10] WenLan: Efficient Large-Scale Multi-Modal Pre-Training on Real World Data
    Song, Ruihua
    MMPT '21: PROCEEDINGS OF THE 2021 WORKSHOP ON MULTI-MODAL PRE-TRAINING FOR MULTIMEDIA UNDERSTANDING, 2021, : 3 - 3