Open Source Initiatives and Frameworks Addressing Distributed Real-time Data Analytics

Cited by: 9
Authors
Morshed, Sarwar Jahan [1 ,2 ]
Rana, Juwel [1 ,3 ]
Milrad, Marcelo [4 ]
Affiliations
[1] Linnaeus Univ, Vaxjo, Sweden
[2] Daffodil Int Univ, Dhaka, Bangladesh
[3] Telenor Grp, Oslo, Norway
[4] Linnaeus Univ, Dept Media Technol, Vaxjo, Sweden
Source
2016 IEEE 30TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW) | 2016
Keywords
Real-time; data analytics; big data; streaming data; data analytics framework; distributed real-time data analysis;
DOI
10.1109/IPDPSW.2016.152
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
The continuous evolution of digital services is resulting in the generation of extremely large data sets that are created in almost real time. Exploring new opportunities for improving the quality of these digital services and providing better-personalized experiences to digital users are two major challenges to be addressed. Different methods, tools, and techniques exist today to generate actionable insights from digital services data. Traditionally, big data problems are handled on historical data sets. However, there is a growing demand for real-time data analytics to offer new services to users and to provide proactive customer care, personalized ads, and emergency aid, to give just a few examples. Although a few frameworks for real-time analytics exist, using them to solve distributed real-time big data analytics problems remains a challenge. Existing real-time data analytics (RTDA) frameworks do not cover all the features required for distributed computation in real time. Therefore, in this paper, we present a qualitative overview and analysis of some of the most widely used existing RTDA frameworks. Specifically, Apache Spark, Apache Flink, Apache Storm, and Apache Samza are covered and discussed.
Pages: 1481-1484
Page count: 4
Related papers
13 in total
[1]  
Das A., 2003, ACM SIGMOD C JUN
[2]  
Das Sarma A., 2009, VLDB, P85, DOI DOI 10.14778/1687627.1687638
[3]   Parallel Clustering of High-Dimensional Social Media Data Streams [J].
Gao, Xiaoming ;
Ferrara, Emilio ;
Qiu, Judy .
2015 15TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING, 2015, :323-332
[4]  
Garg N, 2013, Apache Kafka
[5]  
Hueske F, 2012, PROC VLDB ENDOW, V5, P1256
[6]  
Kamburugamuve Supun, 2013, TECHNICAL REPORT
[7]  
Kejariwal A, 2015, PROC VLDB ENDOW, V8, P2041
[8]   The real-time city? Big data and smart urbanism [J].
Kitchin, Rob .
GEOJOURNAL, 2014, 79 (01) :1-14
[9]  
Kulkarni S., P 2015 ACM SIGMOD IN, P239
[10]  
Marz Nathan., 2013, STORM DISTRIBUTED FA