An improved query optimization process in big data using ACO-GA algorithm and HDFS map reduce technique

Times cited: 13
Authors
Kumar, Deepak [1 ]
Jha, Vijay Kumar [1 ]
Affiliations
[1] Birla Inst Technol Mesra, Dept Comp Sci & Engn, Ranchi, Bihar, India
Keywords
Secure Hash Algorithm (SHA-512); Hadoop Distributed File System (HDFS); Normalized K-Means (NKM) algorithm; Ant Colony Optimization-Genetic Algorithm (ACO-GA); Energy efficiency; MapReduce; Performance; Hadoop
DOI
10.1007/s10619-020-07285-z
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Subject classification code
0812;
Abstract
Storing and retrieving data within a specific time frame is fundamental for any application today. An efficiently designed query lets the user get results in the desired time and builds credibility for the corresponding application. To address the difficulty of query optimization, this paper proposes an improved query optimization process for big data (BD) using the ACO-GA algorithm and HDFS map-reduce. The proposed methodology consists of two phases: BD arrangement and query optimization. In the first phase, the input data is pre-processed by computing a hash value (HV) with the SHA-512 algorithm and removing repeated data with the HDFS map-reduce function. Then, features such as closed frequent patterns, support, and confidence are extracted. Next, the support and confidence are managed by means of an entropy calculation. Based on the entropy calculation, the related information is grouped using the Normalized K-Means (NKM) algorithm. In the second phase, the BD queries are collected, and the same features are extracted. Next, the optimized query is found using the ACO-GA algorithm. Finally, a similarity assessment is performed. The experimental results show that the proposed algorithm outperforms other existing algorithms.
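The pre-processing and feature-extraction steps of the first phase can be illustrated with a minimal sketch. It is not the authors' implementation: it runs in a single process rather than on HDFS, the function names (map_phase, reduce_phase, support_confidence) are hypothetical, and support and confidence follow the standard association-rule definitions, which the abstract does not spell out. The entropy, NKM, and ACO-GA steps are omitted because the abstract gives no detail on them.

```python
import hashlib
from collections import defaultdict


def map_phase(records):
    """Emit (hash value, record) pairs; the SHA-512 hex digest is the key."""
    for record in records:
        hv = hashlib.sha512(record.encode("utf-8")).hexdigest()
        yield hv, record


def reduce_phase(pairs):
    """Group records by hash value and keep one record per key (assumed dedup rule)."""
    groups = defaultdict(list)
    for hv, record in pairs:
        groups[hv].append(record)
    # Identical records share a digest, so the first entry represents the group.
    return [recs[0] for recs in groups.values()]


def support_confidence(transactions, antecedent, consequent):
    """Standard support and confidence of the rule antecedent -> consequent."""
    a = set(antecedent)
    both = a | set(consequent)
    n_a = sum(1 for t in transactions if a <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    support = n_both / len(transactions)
    confidence = n_both / n_a if n_a else 0.0
    return support, confidence


if __name__ == "__main__":
    rows = ["a,b", "c,d", "a,b"]
    print(reduce_phase(map_phase(rows)))

    baskets = [["milk", "bread"], ["milk"], ["bread"], ["milk", "bread", "eggs"]]
    print(support_confidence(baskets, ["milk"], ["bread"]))
```

Keying on the digest rather than on the raw record keeps the grouping key at a fixed size regardless of record length, which is one plausible reason for hashing before the reduce step.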
Pages: 79-96
Number of pages: 18