Aras: A method with uniform distributed dataset to solve data warehouse problems for big data

Cited by: 3
Authors
Barkhordari M. [1]
Niamanesh M. [1]
Affiliation
[1] Information and Communication Technology Research Center, Advance Information System Research Group, Tehran
Keywords
Big data; Data locality; Data warehouse; MapReduce
DOI
10.4018/IJDST.2017040104
Abstract
Because of the high rate of data growth and the need for data analysis, data warehouse management for big data is an important issue. Single-node solutions cannot manage such large amounts of information, so the data must be distributed over multiple hardware nodes. However, distributing data over nodes means that each node needs data from other nodes to execute a query. Data exchange among nodes creates problems such as joins between data segments that reside on different nodes, network congestion, and hardware nodes waiting to receive data. This paper proposes the Aras method, a MapReduce-based method that introduces a data set on each mapper. With this method, each mapper node can execute its query independently, without exchanging data with other nodes. Node independence solves the aforementioned data distribution problems. The proposed method was compared with prominent data warehouses for big data, and the Aras query execution time was much lower than that of the other methods. © 2017, IGI Global.
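The idea of node independence described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the table names, the sample rows, and the helper functions map_query, reduce_partials, and run_query are all hypothetical, and the sketch simply assumes that each mapper already holds every row it needs, so the join runs locally and only small partial aggregates are merged at the end.

```python
from collections import Counter

# Hypothetical data: each mapper holds its own fact rows plus a local copy
# of the dimension rows those facts refer to (an assumption for illustration).
MAPPER_DATASETS = [
    {
        "sales": [("p1", 10), ("p2", 5)],
        "products": {"p1": "books", "p2": "games"},
    },
    {
        "sales": [("p1", 7), ("p3", 2)],
        "products": {"p1": "books", "p3": "music"},
    },
]


def map_query(dataset):
    """Join and aggregate using only the data held by this mapper."""
    totals = Counter()
    for product_id, amount in dataset["sales"]:
        # Local dictionary lookup: no data is requested from other mappers.
        category = dataset["products"][product_id]
        totals[category] += amount
    return totals


def reduce_partials(partials):
    """Merge the per-mapper partial aggregates into one result."""
    result = Counter()
    for partial in partials:
        result.update(partial)  # Counter.update adds counts together
    return dict(result)


def run_query(datasets):
    """Run the same query on every mapper, then combine the partials."""
    return reduce_partials(map_query(d) for d in datasets)


if __name__ == "__main__":
    print(run_query(MAPPER_DATASETS))
```

In this sketch the only data that crosses mapper boundaries is the per-category total, which is the property the abstract attributes to node independence.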
Pages: 47-60
Page count: 13