Developing a Real-time Data Analytics Framework For Twitter Streaming Data

Cited by: 20
Authors
Yadranjiaghdam, Babak [1]
Yasrobi, Seyedfaraz [1]
Tabrizi, Nasseh [1]
Affiliations
[1] East Carolina Univ, Dept Comp Sci, Greenville, NC 27858 USA
Source
2017 IEEE 6TH INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS 2017) | 2017
Keywords
Streaming processing; Big Data; Kafka; Spark; Twitter; Real-time
DOI
10.1109/BigDataCongress.2017.49
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Twitter is an online social networking service with more than 300 million users, generating a huge amount of information every day. Its most important characteristic is that users can tweet in real time about events, situations, feelings, opinions, or even something totally new. Several workflows currently offer real-time data analysis for Twitter, providing general processing over streaming data. This study attempts to develop an analytical framework that uses in-memory processing to extract and analyze structured and unstructured Twitter data. The proposed framework comprises data ingestion, stream processing, and data visualization components. The Apache Kafka messaging system performs the data ingestion task, and Spark makes it possible to run sophisticated data processing and machine learning algorithms in real time. We conducted a case study on tweets about the earthquake in Japan and the reactions of people around the world, analyzing the time and origin of the tweets.
Pages: 329 - 336
Number of pages: 8
Related papers
50 in total
  • [21] A Big Data Architecture for Near Real-time Traffic Analytics
    Gong, Yikai
    Rimba, Paul
    Sinnott, Richard O.
    COMPANION PROCEEDINGS OF THE 10TH INTERNATIONAL CONFERENCE ON UTILITY AND CLOUD COMPUTING (UCC'17 COMPANION), 2017, : 157 - 162
  • [22] Real-Time Bigdata Analytics: A Stream Data Mining Approach
    Tidke, Bharat
    Mehta, Rupa G.
    Dhanani, Jenish
    RECENT FINDINGS IN INTELLIGENT COMPUTING TECHNIQUES, VOL 2, 2018, 708 : 345 - 351
  • [23] A Survey on Real-time Big Data Analytics: Applications and Tools
    Yadranjiaghdam, Babak
    Pool, Nathan
    Tabrizi, Nasseh
    2016 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE & COMPUTATIONAL INTELLIGENCE (CSCI), 2016, : 404 - 409
  • [24] Open Source Initiatives and Frameworks Addressing Distributed Real-time Data Analytics
    Morshed, Sarwar Jahan
    Rana, Juwel
    Milrad, Marcelo
    2016 IEEE 30TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2016, : 1481 - 1484
  • [25] Real-Time Data Harvesting Method for Czech Twitter
    Kral, Pavel
    Rajtmajer, Vaclav
    ICAART: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2, 2017, : 259 - 265
  • [26] Real-Time Streaming Data Delivery over Named Data Networking
    Gusev, Peter
    Wang, Zhehao
    Burke, Jeff
    Zhang, Lixia
    Yoneda, Takahiro
    Ohnishi, Ryota
    Muramoto, Eiichi
    IEICE TRANSACTIONS ON COMMUNICATIONS, 2016, E99B (05) : 974 - 991
  • [27] A task-level adaptive Map Reduce framework for real-time streaming data in healthcare applications
    Zhang, Fan
    Cao, Junwei
    Khan, Samee U.
    Li, Keqin
    Hwang, Kai
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2015, 43-44 : 149 - 160
  • [28] Real-Time Effective Framework for Unstructured Data Mining
    Lomotey, Richard K.
    Deters, Ralph
    2013 12TH IEEE INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM 2013), 2013, : 1081 - 1088
  • [29] Real-time Spread Burst Detection in Data Streaming
    Wang, Haibo
    Melissourgos, Dimitrios
    Ma, Chaoyi
    Chen, Shigang
    PROCEEDINGS OF THE ACM ON MEASUREMENT AND ANALYSIS OF COMPUTING SYSTEMS, 2023, 7 (02) : 1 - 31
  • [30] Real-Time Spread Burst Detection in Data Streaming
    Wang H.
    Melissourgos D.
    Ma C.
    Chen S.
    Performance Evaluation Review, 2023, 51 (01) : 51 - 52