Developing a Real-time Data Analytics Framework For Twitter Streaming Data

Cited by: 20
Authors
Yadranjiaghdam, Babak [1]
Yasrobi, Seyedfaraz [1]
Tabrizi, Nasseh [1]
Affiliations
[1] East Carolina Univ, Dept Comp Sci, Greenville, NC 27858 USA
Keywords
Streaming processing; Big Data; Kafka; Spark; Twitter; Real-time
DOI
10.1109/BigDataCongress.2017.49
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Twitter is an online social networking service with more than 300 million users who generate a huge amount of information every day. Its most important characteristic is that users can tweet about events, situations, feelings, opinions, or even something entirely new, in real time. Several existing workflows offer real-time data analysis for Twitter, but they provide only general-purpose processing over streaming data. This study develops an analytical framework that uses in-memory processing to extract and analyze structured and unstructured Twitter data. The proposed framework consists of data ingestion, stream processing, and data visualization components, with the Apache Kafka messaging system performing the data ingestion task. Spark, in turn, makes it possible to run sophisticated data processing and machine learning algorithms in real time. We conducted a case study on tweets about the earthquake in Japan and the reactions of people around the world, analyzing the time and origin of the tweets.
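The three stages named in the abstract (ingestion, stream processing, visualization) can be illustrated with a minimal, library-free Python sketch. This is not the authors' implementation. A `deque` stands in for a Kafka topic, a plain function stands in for a Spark job, and the sample records, field names (`text`, `created_at`, `country`), and function names are all hypothetical.

```python
from collections import Counter, deque
from datetime import datetime, timezone

# Made-up sample records; the schema is an assumption, not the paper's.
SAMPLE_TWEETS = [
    {"text": "Earthquake in Japan, hope everyone is safe",
     "created_at": "2017-01-01T10:05:00+00:00", "country": "JP"},
    {"text": "Just felt the earthquake",
     "created_at": "2017-01-01T10:40:00+00:00", "country": "JP"},
    {"text": "Lunch time",
     "created_at": "2017-01-01T10:45:00+00:00", "country": "US"},
    {"text": "Thoughts with Japan after the earthquake",
     "created_at": "2017-01-01T11:10:00+00:00", "country": "US"},
]


def ingest(tweets, buffer):
    """Ingestion stage: append raw records to a buffer (stand-in for a Kafka topic)."""
    for tweet in tweets:
        buffer.append(tweet)


def process(buffer, keyword):
    """Processing stage: drain the buffer, keep tweets containing the keyword,
    and count them per UTC hour and per country of origin."""
    by_hour = Counter()
    by_country = Counter()
    while buffer:
        tweet = buffer.popleft()
        if keyword not in tweet["text"].lower():
            continue
        ts = datetime.fromisoformat(tweet["created_at"]).astimezone(timezone.utc)
        by_hour[ts.strftime("%Y-%m-%d %H:00")] += 1
        by_country[tweet["country"]] += 1
    return by_hour, by_country


def render(counter):
    """Visualization stage: format the counts as a simple text bar chart."""
    return [f"{key:<18}{'#' * count} {count}" for key, count in sorted(counter.items())]


if __name__ == "__main__":
    topic = deque()
    ingest(SAMPLE_TWEETS, topic)
    hours, countries = process(topic, "earthquake")
    print("\n".join(render(hours)))
    print("\n".join(render(countries)))
```

In a real deployment the buffer would be replaced by a message broker and the processing function by a distributed streaming job. The sketch only shows how time and origin counts can be derived from a stream of tweet records.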
Pages: 329-336
Page count: 8
Related papers
50 in total
  • [31] A spark-based big data analysis framework for real-time sentiment prediction on streaming data
    Kilinc, Deniz
    SOFTWARE-PRACTICE & EXPERIENCE, 2019, 49 (09) : 1352 - 1364
  • [32] HCache: A Hash-based Hybrid Caching Model for Real-Time Streaming Data Analytics
    Zhao, Feng
    Li, Shaofeng
    Zhou, Bing Bing
    Jin, Hai
    Yang, Laurence T.
    IEEE TRANSACTIONS ON SERVICES COMPUTING, 2021, 14 (05) : 1384 - 1396
  • [33] REAL-TIME BIG DATA ANALYTICS FRAMEWORK WITH DATA BLENDING APPROACH FOR MULTIPLE DATA SOURCES IN SMART CITY APPLICATIONS
    Manjunatha, S.
    Annappa, B.
    SCALABLE COMPUTING-PRACTICE AND EXPERIENCE, 2020, 21 (04) : 611 - 623
  • [34] Real-time Streaming Technology and Analytics for Insights
    Shim, J. P.
    Nisar, Karan
    DIGITAL INNOVATION AND ENTREPRENEURSHIP (AMCIS 2021), 2021,
  • [35] Real-time Traffic Classification with Twitter Data Mining
    Kurniawan, Dwi Aji
    Wibirama, Sunu
    Setiawan, Noor Akhmad
    PROCEEDINGS OF 2016 8TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND ELECTRICAL ENGINEERING (ICITEE), 2016,
  • [36] Scalable and Real-time Sentiment Analysis of Twitter Data
    Karanasou, Maria
    Ampla, Anneta
    Doulkeridis, Christos
    Halkidi, Maria
    2016 IEEE 16TH INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW), 2016, : 944 - 951
  • [37] A Machine Hearing Framework for Real-Time Streaming Analytics Using Lambda Architecture
    Demertzis, Konstantinos
    Iliadis, Lazaros
    Anezakis, Vardis-Dimitris
    ENGINEERING APPLICATIONS OF NEURAL NETWORKS, 2019, 1000 : 246 - 261
  • [38] Real-Time Data Harvesting Method for Czech Twitter
    Kral, Pavel
    Rajtmajer, Vaclav
    ICAART: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2, 2017, : 259 - 265
  • [39] Real-Time Streaming Data Delivery over Named Data Networking
    Gusev, Peter
    Wang, Zhehao
    Burke, Jeff
    Zhang, Lixia
    Yoneda, Takahiro
    Ohnishi, Ryota
    Muramoto, Eiichi
    IEICE TRANSACTIONS ON COMMUNICATIONS, 2016, E99B (05) : 974 - 991
  • [40] A Methodology of Real-Time Data Fusion for Localized Big Data Analytics
    Jabbar, Sohail
    Malik, Kaleem R.
    Ahmad, Mudassar
    Aldabbas, Omar
    Asif, Muhammad
    Khalid, Shehzad
    Han, Kijun
    Ahmed, Syed Hassan
    IEEE ACCESS, 2018, 6 : 24510 - 24520