A Study on Big Data Processing Frameworks: Spark and Storm

Cited by: 3

Authors:

Deshai, N. [1]

Venkataramana, S. [1]

Sekhar, B. V. D. S. [1]

Srinivas, K. [1]

Varma, G. P. Saradhi [1]

Affiliation:

[1] JNTUK, SRKR Engn Coll, Dept Informat Technol, Bhimavaram, Andhra Pradesh, India

Source:

SMART INTELLIGENT COMPUTING AND APPLICATIONS, VOL 2 | 2020 / Vol. 160

Keywords:

Big data; Hadoop; MapReduce; YARN; Spark; Storm

DOI:

10.1007/978-981-32-9690-9_43

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Today's internet world is moving from petabytes to exabytes of data, driven by the enormous volume of unstructured datasets generated by social sites, IoT, Google, Twitter, Yahoo, sensor-based environmental monitoring, and similar sources; this is big data (BD). Dataset volumes grow from second to second, yet techniques for smooth, dynamic processing, analysis, and scalability are in short supply. Over the recent decade, only existing methods and common tools built for gigabyte-scale processing have been applied to compute over the world's huge data. Apache Hadoop, a free and open-source framework, is a recent BD tool that can process zettabyte-scale datasets through its most developed and popular components, HDFS and MapReduce (MR), which provide storage and reliable processing. MR is a popular software framework for handling BD problems in a fully parallel, highly distributed, and scalable manner. Despite this, Hadoop and its map and reduce tasks have limitations, including poor allocation of custom resources, lack of stream processing, high latency, inefficient performance, limited optimization, and weak support for real-time computation and diverse logical interpretation. We highlight the most modern and advanced features of these computing procedures. This paper examines Apache Spark and Apache Storm, two fast and recent frameworks that can overcome those limitations.


Pages: 415-424

Page count: 10
