Normalized Storage Model Construction and Query Optimization of Book Multi-Source Heterogeneous Massive Data

Cited by: 3
Authors
Wang, Dailin [1]
Liu, Lina [2]
Liu, Yali [1]
Affiliations
[1] Northeast Forestry Univ, Lib, Harbin 150040, Heilongjiang, Peoples R China
[2] Northeast Forestry Univ, Coll Comp & Control Engn, Harbin 150040, Heilongjiang, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Data mining; Feature extraction; Hidden Markov models; Information retrieval; Web pages; Data models; Metaverse; Distributed management; Query processing; Heterogeneous information; Multi-source book data; Extraction model; HBase distributed storage; Query optimization
DOI
10.1109/ACCESS.2023.3301134
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Book literature data, viewed from the perspective of the metaverse, is massive, multi-source, heterogeneous, and fast-growing. Managing and retrieving this unstructured, heterogeneous book information efficiently requires standardized storage, effective extraction, and scientific library construction. This study therefore focuses on three problems: normalizing multi-source heterogeneous massive book data, constructing a warehouse model for book data in the metaverse perspective, and querying book data and optimizing those queries. Systematic research and implementation were carried out to address how such data is processed, managed, and queried in the metaverse, improving both the utilization value of the data and its query efficiency. The study uses the semi-structured features of book text data to construct an extraction rule model for heterogeneous book data, and applies that model to extract massive heterogeneous book information. Based on the HBase distributed storage structure and parallel computing technology, the storage scheme is optimized and query efficiency is improved, ensuring efficient management and retrieval of massive heterogeneous book data. Experimental results show that, compared with traditional methods, the approach achieves significant improvements in the accuracy and recall of book text data extraction and in the management and query efficiency of book information.
Pages: 96543-96553
Number of pages: 11