Developing a Real-time Data Analytics Framework for Twitter Streaming Data

Cited by: 21
Authors
Yadranjiaghdam, Babak [1]
Yasrobi, Seyedfaraz [1]
Tabrizi, Nasseh [1]
Affiliations
[1] East Carolina Univ, Dept Comp Sci, Greenville, NC 27858 USA
Source
2017 IEEE 6TH INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS 2017) | 2017
Keywords
Streaming processing; Big Data; Kafka; Spark; Twitter; Real-time
DOI
10.1109/BigDataCongress.2017.49
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Twitter is an online social networking service with more than 300 million users, generating a huge amount of information every day. Its most important characteristic is that users can tweet about events, situations, feelings, opinions, or even something totally new, in real time. Several existing workflows offer real-time data analysis for Twitter, providing general processing over streaming data. This study attempts to develop an analytical framework that uses in-memory processing to extract and analyze structured and unstructured Twitter data. The proposed framework includes data ingestion, stream processing, and data visualization components; the Apache Kafka messaging system performs the data ingestion task, and Spark makes it possible to run sophisticated data processing and machine learning algorithms in real time. We have conducted a case study on tweets about the earthquake in Japan and the reactions of people around the world, analyzing the time and origin of the tweets.
Pages: 329-336
Number of pages: 8
Related papers
21 in total
[1]  
[Anonymous], INT C HIGH PERF COMP
[2]  
Bifet A, 2013, INFORM-J COMPUT INFO, V37, P15
[3]   Big data architecture for construction waste analytics (CWA): A conceptual framework [J].
Bilal, Muhammad ;
Oyedele, Lukumon O. ;
Akinade, Olugbenga O. ;
Ajayi, Saheed O. ;
Alaka, Hafiz A. ;
Owolabi, Hakeem A. ;
Qadir, Junaid ;
Pasha, Maruf ;
Bello, Sururah A. .
JOURNAL OF BUILDING ENGINEERING, 2016, 6 :144-156
[4]   Developing a Real-Time Data Analytics Framework using Hadoop [J].
Cha, Sangwhan ;
Wachowicz, Monica .
2015 IEEE INTERNATIONAL CONGRESS ON BIG DATA - BIGDATA CONGRESS 2015, 2015, :657-660
[5]  
Chowdhury M. Z. Mosharaf, 2012, NSDI 12 P 9 USENIX C
[6]   #Earthquake: Twitter as a Distributed Sensor System [J].
Crooks, Andrew ;
Croitoru, Arie ;
Stefanidis, Anthony ;
Radzikowski, Jacek .
TRANSACTIONS IN GIS, 2013, 17 (01) :124-147
[7]   Earthquake Twitter [J].
Earle, Paul .
NATURE GEOSCIENCE, 2010, 3 (04) :221-222
[8]   Twitter earthquake detection: earthquake monitoring in a social world [J].
Earle, Paul S. ;
Bowden, Daniel C. ;
Guy, Michelle .
ANNALS OF GEOPHYSICS, 2011, 54 (06) :708-715
[9]   MRSL: Autonomous Neural Network-Based 3-D Positioning System [J].
Hedayati, Hooman ;
Tabrizi, Nasseh .
2015 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND COMPUTATIONAL INTELLIGENCE (CSCI), 2015, :170-174
[10]  
Jones M.T., 2013, PROCESS REAL TIME BI