Normalized Storage Model Construction and Query Optimization of Book Multi-Source Heterogeneous Massive Data

Cited by: 3
Authors
Wang, Dailin [1]
Liu, Lina [2]
Liu, Yali [1]
Affiliations
[1] Northeast Forestry Univ, Lib, Harbin 150040, Heilongjiang, Peoples R China
[2] Northeast Forestry Univ, Coll Comp & Control Engn, Harbin 150040, Heilongjiang, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Data mining; Feature extraction; Hidden Markov models; Information retrieval; Web pages; Data models; Metaverse; Distributed management; Query processing; Heterogeneous information; multi-source book data; extraction model; HBase distributed storage; query optimization;
DOI
10.1109/ACCESS.2023.3301134
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Book literature data viewed from the perspective of the metaverse are massive, multi-source, heterogeneous, and fast-growing. To meet the requirements of efficient management and rapid retrieval of unstructured, massive, heterogeneous book information, including standardized storage, effective extraction, and scientific library construction, this study focuses on three problems: the normalization of multi-source heterogeneous massive book data, the construction of a warehouse model for book data from the metaverse perspective, and the querying and query optimization of book data. Systematic research and implementation were carried out to address how to process, manage, and query multi-source heterogeneous massive book data in the metaverse, improving the utilization value and query efficiency of the data. The study uses the semi-structured features of book text data to construct an extraction rule model for heterogeneous book data and effectively extracts massive heterogeneous book information. Based on the HBase distributed storage structure and parallel computing technology, the storage scheme is optimized and query efficiency is improved, ensuring efficient management and retrieval of massive heterogeneous book data. The experimental results show that, compared with traditional methods, the approach achieves significant improvements in several respects, including the accuracy and recall of book text data extraction, the management of book information, and query efficiency.
Pages: 96543 - 96553
Page count: 11