A Solution to Query Processing Challenges Through Smart Query Processor for Big Data Analytics

Cited by: 0

Authors:

Vaidya G.M. [1]

Kshirsagar M.M. [2]

Affiliations:

[1] Department of Information Technology, Yeshwantrao Chavan College of Engineering, Nagpur

[2] Department of Computer Technology, Yeshwantrao Chavan College of Engineering, Nagpur

Source:

SN Computer Science, Vol. 4, Issue 2

Keywords:

Big data; Data management; Data preprocessing; Data processing; Query optimization

DOI:

10.1007/s42979-022-01581-4

Abstract:

Data are generated and collected continuously throughout the world. The variety of data categories gathered from different sources raises challenges that are difficult to handle. Any data analytics work starts with preprocessing, which is the single biggest challenge because data separation and categorization take much time. Once this step is completed, practitioners can process the data with several tools and move on to the next steps, such as centralization and indexing. Many scholars are working to obtain quick responses from the information, and the data analytics task is becoming easier as a result. Previous work focused on several challenging issues and provided solutions for preprocessing and data management. This paper proposes a big data processing framework, Smart Query Processor (SQP), which addresses challenges in query processing and processes about 500 GB of data. It describes a novel approach using hybrid algorithms that obtains results 5x faster than existing approaches. Finally, a comparison with previously published work shows an accuracy of up to 95–96%. In the future, the work will be extended to process several TB of data on the highly configured workstations available in the labs. © 2023, The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd.

References

9 entries in total

[1]

Xinhui T., Shaopeng D., Zhihui D., Wanling G., Rui R., Yaodong C., Zhifei Z., Zhen J., Peijian W., Jianfeng Z., BigDataBench-S: An open-source scientific big data benchmark suite, In: IEEE International Parallel and Distributed Processing Symposium Workshops. Lake Buena Vista, FL, USA: IEEE Xplore, pp. 1068-1077, (2017)

[2]

Charith S., Mahsa S., Mo S., Data science in public mental health: A new analytic framework, Fourth International Workshop on ICT Solutions for Health, pp. 1122-1128, (2019)

[3]

Congkai B., Meiyang C., Query optimization of massive social network data based on HBase, 4th IEEE International Conference on Big Data Analytics, pp. 94-97, (2019)

[4]

Dimitrios S., Olivier M., Reducing data complexity in feature extraction and feature selection for big data security analytic, 1st International Conference on Data Intelligence and Security. Nanjing, China: IEEE Xplore, pp. 43-48, (2018)

[5]

Rajanikanth A., Jabbar M.A., Handling data analytics on unstructured data using MongoDB, Smart Cities Symposium

[6]

Rustem D., Salvatore D., Dario B., Francesco L., Giovani M., Antonio P., Data processing in cyber-physical-social systems through edge computing, IEEE access–cyber-physical-social Computing and Networking. Messina

[7]

Meryeme E.H., Maryem R., Bouchra E.A., Hybrid big data warehouse for on-demand decision needs, 3rd International Conference on Electrical and Information Technologies ICEIT. Rabat

[8]

Jai P.V., Sapan H.M., Sanjay G., Big data analytics: Performance evaluation for high availability and fault tolerance using MapReduce in framework with HDFS, 5th IEEE International Conference on Parallel, Distributed and Grid Computing (PDGC-2018). Solan, India: IEEE Xplore, pp. 770-774, (2018)

[9]

Meihui S., Derong S., Tiezheng N., Yue K., Ge Y., HPPQ: a parallel package queries processing approach for large-scale data, Big Data Min Anal, 1, 2, pp. 146-159, (2018)
