Developing a Real-time Data Analytics Framework For Twitter Streaming Data

Cited by: 20
Authors
Yadranjiaghdam, Babak [1]
Yasrobi, Seyedfaraz [1]
Tabrizi, Nasseh [1]
Affiliation
[1] East Carolina Univ, Dept Comp Sci, Greenville, NC 27858 USA
Source
2017 IEEE 6TH INTERNATIONAL CONGRESS ON BIG DATA (BIGDATA CONGRESS 2017) | 2017
Keywords
Streaming processing; Big Data; Kafka; Spark; Twitter; Real-time
DOI
10.1109/BigDataCongress.2017.49
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Twitter is an online social networking service with more than 300 million users that generates a huge amount of information every day. Its most important characteristic is that users can tweet about events, situations, feelings, opinions, or even something entirely new, in real time. Several existing workflows offer real-time analysis of Twitter data, providing general processing over streaming data. This study aims to develop an analytical framework with in-memory processing that extracts and analyzes structured and unstructured Twitter data. The proposed framework comprises data ingestion, stream processing, and data visualization components, with the Apache Kafka messaging system performing the data ingestion task. Spark, in turn, makes it possible to run sophisticated data processing and machine learning algorithms in real time. We conducted a case study on tweets about the earthquake in Japan and the reactions of people around the world, analyzing the time and origin of the tweets.
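The ingestion-then-processing flow described in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' implementation and uses neither Kafka nor Spark: a `queue.Queue` stands in for a Kafka topic, a generator stands in for micro-batch stream processing, and the sample records, field names (`text`, `country`, `ts`), and function names are all hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone
from queue import Queue, Empty

# Hypothetical sample records; the field names are assumptions for this
# sketch, not the Twitter API schema or data from the paper.
SAMPLE_TWEETS = [
    {"text": "Earthquake felt in Tokyo", "country": "JP", "ts": "2017-01-01T09:15:00+00:00"},
    {"text": "Hoping everyone is safe after the earthquake", "country": "US", "ts": "2017-01-01T09:40:00+00:00"},
    {"text": "Lunch was great today", "country": "US", "ts": "2017-01-01T10:05:00+00:00"},
    {"text": "Another earthquake aftershock", "country": "JP", "ts": "2017-01-01T10:20:00+00:00"},
    {"text": "Earthquake news from Japan", "country": "GB", "ts": "2017-01-01T10:45:00+00:00"},
]


def ingest(tweets, topic):
    """Ingestion stage: push raw records onto the queue (stand-in for a Kafka topic)."""
    for tweet in tweets:
        topic.put(tweet)


def micro_batches(topic, batch_size):
    """Drain the queue in small batches, loosely mimicking micro-batch streaming."""
    batch = []
    while True:
        try:
            batch.append(topic.get_nowait())
        except Empty:
            break
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # emit the final, possibly smaller, batch


def analyze(topic, keyword, batch_size=2):
    """Processing stage: count matching tweets by origin and by UTC hour."""
    by_country = Counter()
    by_hour = Counter()
    for batch in micro_batches(topic, batch_size):
        for tweet in batch:
            if keyword not in tweet["text"].lower():
                continue  # skip tweets unrelated to the tracked event
            by_country[tweet["country"]] += 1
            hour = datetime.fromisoformat(tweet["ts"]).astimezone(timezone.utc).hour
            by_hour[hour] += 1
    return by_country, by_hour


if __name__ == "__main__":
    topic = Queue()
    ingest(SAMPLE_TWEETS, topic)
    countries, hours = analyze(topic, "earthquake")
    print("Tweets by origin:", dict(countries))
    print("Tweets by UTC hour:", dict(hours))
```

In a deployment along the lines the abstract describes, the queue would be replaced by a Kafka topic and the aggregation loop by a Spark streaming job; the two counters correspond to the "time and origin" analysis mentioned for the case study.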
Pages: 329-336
Number of pages: 8
Related papers
50 in total
  • [31] Text Mining and Real-Time Analytics of Twitter Data: A Case Study of Australian Hay Fever Prediction
    Subramani, Sudha
    Michalska, Sandra
    Wang, Hua
    Whittaker, Frank
    Heyward, Benjamin
    HEALTH INFORMATION SCIENCE (HIS 2018), 2018, 11148 : 134 - 145
  • [32] Beyond Batch Processing: Towards Real-Time and Streaming Big Data
    Shahrivari, Saeed
    COMPUTERS, 2014, 3 (04) : 117 - 129
  • [33] Benchmarking real-time vehicle data streaming models for a smart city
    Fernandez-Rodriguez, Jorge Y.
    Alvarez-Garcia, Juan A.
    Arias Fisteus, Jesus
    Luaces, Miguel R.
    Corcoba Magana, Victor
    INFORMATION SYSTEMS, 2017, 72 : 62 - 76
  • [34] VALID: A Web Framework for Visual Analytics of Large Streaming Data
    Li, Chenhui
    Baciu, George
    2014 IEEE 13TH INTERNATIONAL CONFERENCE ON TRUST, SECURITY AND PRIVACY IN COMPUTING AND COMMUNICATIONS (TRUSTCOM), 2014 : 686 - 692
  • [35] A Scalable Streaming Big Data Architecture for Real-Time Sentiment Analysis
    Ayvaz, Serkan
    Shiha, Mohammed O.
    PROCEEDINGS OF 2018 2ND INTERNATIONAL CONFERENCE ON CLOUD AND BIG DATA COMPUTING (ICCBDC 2018), 2018 : 47 - 51
  • [36] A survey on data stream, big data and real-time
    Gomes E.H.A.
    Plentz P.D.M.
    De Rolt C.R.
    Dantas M.A.R.
    International Journal of Networking and Virtual Organisations, 2019, 20 (02) : 143 - 167
  • [37] Mapping the Big Data Landscape: Technologies, Platforms and Paradigms for Real-Time Analytics of Data Streams
    Dubuc, Timothee
    Stahl, Frederic
    Roesch, Etienne B.
    IEEE ACCESS, 2021, 9 : 15351 - 15374
  • [38] A Robust Architectural Framework for Big Data Stream Computing in Personal Healthcare Real Time Analytics
    Vanathi, R.
    Khadir, A. Shaik Abdul
    2017 2ND WORLD CONGRESS ON COMPUTING AND COMMUNICATION TECHNOLOGIES (WCCCT), 2017 : 97 - 104
  • [39] Real-time event detection from the Twitter data stream using the TwitterNews plus Framework
    Hasan, Mahmud
    Orgun, Mehmet A.
    Schwitter, Rolf
    INFORMATION PROCESSING & MANAGEMENT, 2019, 56 (03) : 1146 - 1165
  • [40] A Column Store Engine for Real-Time Streaming Analytics
    Skidanov, Alex
    Papito, Anders J.
    Prout, Adam
    2016 32ND IEEE INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE), 2016 : 1287 - 1297