Using data mining techniques to improve replica management in cloud environment

Cited by: 0
Authors
N. Mansouri
M. M. Javidi
B. Mohammad Hasani Zade
Affiliations
[1] Shahid Bahonar University of Kerman, Department of Computer Science
[2] Shahid Bahonar University of Kerman, Mahani Mathematical Research Center
Source
Soft Computing | 2020 / Vol. 24
Keywords
Cloud computing; Data replication; Data mining; Simulation
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
Effective data management is a crucial problem in distributed systems such as data grids and clouds. It can be achieved by replicating files wisely, which reduces data access time and improves data availability, reliability, and system load balancing. Determining a reasonable number of replicas and appropriate locations for them is an essential decision in cloud computing. In this paper, a new dynamic replication strategy called Data Mining-based Data Replication (DMDR) is proposed, which determines the correlation among accessed data files using the file access history. We focus particularly on how knowledge extracted with maximal frequent correlated pattern mining improves data replication: files with high dependency can be grouped in the same replica set. Through the DMDR strategy, replicas can be stored in suitable locations, chosen according to the centrality factor, with reduced access latency. In addition, because each node has finite storage space, replicas that are useful for future tasks can be wastefully deleted and replaced with less beneficial ones. Simulation results using CloudSim indicate that the DMDR strategy has a relative advantage in effective network usage, average response time, and hit ratio in comparison with current methods. It can be concluded from this investigation that data mining techniques are effective and helpful in finding users' future access behavior in a cloud environment.
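The idea of grouping highly dependent files from an access history can be illustrated with a minimal sketch. This is not the DMDR algorithm: the abstract names maximal frequent correlated pattern mining, whereas the sketch below swaps in a much simpler pairwise co-access count followed by connected components. The function name `correlated_groups`, the `min_support` threshold, and the sample history are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def correlated_groups(history, min_support):
    """Group files that are co-accessed in at least `min_support` sessions.

    Simplified stand-in for correlated pattern mining (an assumption,
    not the paper's method): count pairwise co-accesses, then merge
    every sufficiently frequent pair into one candidate replica set.
    """
    # Count how many sessions contain each unordered pair of files.
    pair_counts = Counter()
    for session in history:
        for a, b in combinations(sorted(set(session)), 2):
            pair_counts[(a, b)] += 1

    # Union-find over file names.
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Register every file, including those with no frequent partner.
    for session in history:
        for f in session:
            find(f)

    # Link the two files of every pair that meets the threshold.
    for (a, b), count in pair_counts.items():
        if count >= min_support:
            parent[find(a)] = find(b)

    # Collect the members of each component.
    groups = {}
    for f in list(parent):
        groups.setdefault(find(f), set()).add(f)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical access history: one list of file names per session.
history = [
    ["f1", "f2", "f3"],
    ["f1", "f2"],
    ["f2", "f1", "f4"],
    ["f3", "f4"],
    ["f5"],
]
groups = correlated_groups(history, min_support=2)
```

With this sample history only `f1` and `f2` are co-accessed often enough to share a group. A real strategy would also have to decide where to place each group and which replicas to evict, which this sketch does not attempt.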
Pages: 7335-7360
Page count: 25