Performance Analysis of MySQL Partition, Hive Partition-Bucketing and Apache Pig

Cited by: 0
Authors
Kumar, A. Sunny [1]
Affiliations
[1] GZSCCET, Comp Sci Engn & Technol, Bhatinda, India
Source
2016 1ST INDIA INTERNATIONAL CONFERENCE ON INFORMATION PROCESSING (IICIP) | 2016
Keywords
Big Data; Hive; MySQL; Pig; Partitioning; Bucketing; Hadoop framework
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Streaming data analysis has attracted attention in applications such as financial records and data analysis. Such applications require continuous storage of large amounts of data in a data warehouse while simultaneously providing quick response times for queries against the stored data. The time needed to fetch data varies with the type of data requested from the system. This paper presents performance estimates for MySQL partitioning, Hive partition-bucketing, and the Apache Pig framework. It describes big data ecosystems and gives a comparative performance analysis of frequently used data retrieval techniques: MySQL, Hive, and Pig. From the work presented, it is concluded that the execution time for extracting data becomes very large as data size grows, particularly in the case of MySQL. Compared with MySQL, Hive and Pig take less time and give better results.
Pages: 6