GuideRender: large-scale scene navigation based on multi-modal view frustum movement prediction

Cited by: 26
Authors
Qin, Yiming [1 ,3 ]
Chi, Xiaoyu [2 ]
Sheng, Bin [1 ]
Lau, Rynson W. H. [3 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Comp Sci & Engn, Shanghai, Peoples R China
[2] Beihang Univ, Qingdao Res Inst, Qingdao, Peoples R China
[3] City Univ Hong Kong, Dept Comp Sci, Kowloon, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Distributed parallel rendering; Multi-modal; View frustum movement prediction; Attentional guidance fusion; GAZE PREDICTION; FRAMEWORK;
DOI
10.1007/s00371-023-02922-x
CLC Classification
TP31 [Computer Software]
Discipline Codes
081202; 0835
Abstract
Distributed parallel rendering provides a valuable way to navigate large-scale scenes. However, previous works typically focused on outputting ultra-high-resolution images. In this paper, we focus on improving the interactivity of navigation and propose a large-scale scene navigation method, GuideRender, based on multi-modal view frustum movement prediction. Given previous frames, user inputs and object information, GuideRender first extracts frame, user-input and object features spatially and temporally using a multi-modal extractor. To obtain effective fused features for prediction, we introduce an attentional guidance fusion module that fuses these features from different domains via attention. Finally, we predict the movement of the view frustum from the attentionally fused features and obtain its future state, so that scene data can be loaded in advance to reduce latency. In addition, to support GuideRender, we design an object hierarchy hybrid tree for scene management based on the object distribution and hierarchy, and an adaptive virtual sub-frustum decomposition method for task decomposition based on the relationship between rendering cost and rendering node capacity. Experimental results show that GuideRender outperforms baselines in navigating large-scale scenes. We also conduct a user study showing that our method satisfies the navigation requirements of large-scale scenes.
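The abstract describes a pipeline of per-modality feature extraction, attention-based fusion, and regression of the view frustum's movement so that scene data can be prefetched. Below is a minimal sketch of that idea, assuming a PyTorch setting; the module names (AttentionalGuidanceFusion, FrustumMovementPredictor), the feature dimensions, and the 6-DoF movement output are illustrative assumptions, not the authors' implementation.

# Minimal sketch (not the paper's code) of attention-guided fusion of frame,
# user-input and object features, followed by frustum-movement prediction.
import torch
import torch.nn as nn

class AttentionalGuidanceFusion(nn.Module):
    """Hypothetical fusion block: frame features cross-attend over the
    concatenated user-input and object features."""
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, frame_feat, input_feat, object_feat):
        # All inputs: (batch, sequence_length, dim)
        guidance = torch.cat([input_feat, object_feat], dim=1)
        fused, _ = self.attn(query=frame_feat, key=guidance, value=guidance)
        return self.norm(fused + frame_feat)  # residual connection

class FrustumMovementPredictor(nn.Module):
    """Regresses a 6-DoF frustum movement (translation + rotation) from the
    fused multi-modal features; the output size is an assumption."""
    def __init__(self, dim: int = 256):
        super().__init__()
        self.fusion = AttentionalGuidanceFusion(dim)
        self.head = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 6))

    def forward(self, frame_feat, input_feat, object_feat):
        fused = self.fusion(frame_feat, input_feat, object_feat)
        # Pool over the temporal axis, then predict the movement.
        return self.head(fused.mean(dim=1))

# Usage: the predicted movement advances the current frustum state so the
# rendering nodes can load the corresponding scene data ahead of time.
predictor = FrustumMovementPredictor()
frame_feat = torch.randn(1, 8, 256)    # features of previous frames
input_feat = torch.randn(1, 8, 256)    # features of user inputs
object_feat = torch.randn(1, 16, 256)  # features of scene objects
movement = predictor(frame_feat, input_feat, object_feat)  # shape (1, 6)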
Pages: 3597-3607
Page count: 11
相关论文
共 50 条
  • [1] GuideRender: large-scale scene navigation based on multi-modal view frustum movement prediction
    Yiming Qin
    Xiaoyu Chi
    Bin Sheng
    Rynson W. H. Lau
    The Visual Computer, 2023, 39 : 3597 - 3607
  • [2] Large-scale Multi-modal Search and QA at Alibaba
    Jin, Rong
    PROCEEDINGS OF THE 43RD INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '20), 2020, : 8 - 8
  • [3] MMpedia: A Large-Scale Multi-modal Knowledge Graph
    Wu, Yinan
    Wu, Xiaowei
    Li, Junwen
    Zhang, Yue
    Wang, Haofen
    Du, Wen
    He, Zhidong
    Liu, Jingping
    Ruan, Tong
    SEMANTIC WEB, ISWC 2023, PT II, 2023, 14266 : 18 - 37
  • [4] Richpedia: A Large-Scale, Comprehensive Multi-Modal Knowledge Graph
    Wang, Meng
    Wang, Haofen
    Qi, Guilin
    Zheng, Qiushuo
    BIG DATA RESEARCH, 2020, 22 (22)
  • [5] DEPTH ESTIMATION OF MULTI-MODAL SCENE BASED ON MULTI-SCALE MODULATION
    Wang, Anjie
    Fang, Zhijun
    Jiang, Xiaoyan
    Gao, Yongbin
    Cao, Gaofeng
    Ma, Siwei
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 2795 - 2799
  • [6] Exploring Multi-Scenario Multi-Modal CTR Prediction with a Large Scale Dataset
    Huan, Zhaoxin
    Ding, Ke
    Li, Ang
    Zhang, Xiaolu
    Min, Xu
    He, Yong
    Zhang, Liang
    Zhou, Jun
    Mo, Linjian
    Gu, Jinjie
    Liu, Zhongyi
    Zhong, Wenliang
    Zhang, Guannan
    Li, Chenliang
    Yuan, Fajie
    PROCEEDINGS OF THE 47TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL, SIGIR 2024, 2024, : 1232 - 1241
  • [7] REINFORCE: rapid augmentation of large-scale multi-modal transport networks for resilience enhancement
    Henry, Elise
    Furno, Angelo
    El Faouzi, Nour-Eddin
    APPLIED NETWORK SCIENCE, 2021, 6 (01)
  • [8] IBISCape: A Simulated Benchmark for multi-modal SLAM Systems Evaluation in Large-scale Dynamic Environments
    Soliman, Abanob
    Bonardi, Fabien
    Sidibe, Desire
    Bouchafa, Samia
    JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS, 2022, 106 (03)
  • [9] IBISCape: A Simulated Benchmark for multi-modal SLAM Systems Evaluation in Large-scale Dynamic Environments
    Abanob Soliman
    Fabien Bonardi
    Désiré Sidibé
    Samia Bouchafa
    Journal of Intelligent & Robotic Systems, 2022, 106
  • [10] WenLan: Efficient Large-Scale Multi-Modal Pre-Training on Real World Data
    Song, Ruihua
    MMPT '21: PROCEEDINGS OF THE 2021 WORKSHOP ON MULTI-MODAL PRE-TRAINING FOR MULTIMEDIA UNDERSTANDING, 2021, : 3 - 3