An efficient approach for low latency processing in stream data

Cited by: 2
Authors
Bhatt, Nirav [1 ]
Thakkar, Amit [2 ]
Affiliations
[1] CHARUSAT, Chandubhai S Patel Inst Technol, Informat Technol, Anand, Gujarat, India
[2] CHARUSAT, Chandubhai S Patel Inst Technol, Comp Sci & Engn, Anand, Gujarat, India
Keywords
Data stream; Stream processing; Latency; SYSTEMS; GOLD; OIL;
DOI
10.7717/peerj-cs.426
CLC Classification Number
TP18 [Theory of artificial intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Stream data is data that is generated continuously from different data sources and is ideally defined as data with no discrete beginning or end. Processing stream data is a part of big data analytics that aims at querying the continuously arriving data and extracting meaningful information from the stream. Although such streams were earlier processed with batch analytics, there are now applications, such as stock market analysis, patient monitoring, and traffic analysis, for which producing output at the level of hours or minutes makes a drastic difference. The primary goal of any real-time stream processing system is to process stream data as soon as it arrives. Correspondingly, analytics on stream data must also take the surrounding dependent data into account. For example, stock market analytics results are often useless if the associated or dependent parameters that affect the result are not considered. In real-world applications, these dependent stream data usually arrive from a distributed environment. Hence, the stream processing system has to be designed to deal with delays in the arrival of such data from distributed sources. We have designed a stream processing model that handles all possible sources of latency and provides an end-to-end low-latency system. We have performed stock market prediction by considering affecting parameters, such as the USD rate, oil price, and gold price, with an equal arrival rate. We have calculated the Normalized Root Mean Square Error (NRMSE), which simplifies comparison among models with different scales. A comparative analysis of the experiments presented in the report shows a significant improvement in the results when the affecting parameters are considered. In this work, we have used a statistical approach to forecast the probability of data latency arising from distributed sources. Moreover, we have performed preprocessing of the stream data to ensure at-least-once delivery semantics. In the direction of providing low processing latency, we have also implemented exactly-once processing semantics. Extensive experiments have been performed with varying window sizes and data arrival rates. We conclude that system latency can be reduced when the window size is equal to the data arrival rate.
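The record does not state how the NRMSE is normalized; a common formulation, assuming normalization by the range of the observed values, is

    \mathrm{RMSE} = \sqrt{\tfrac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}, \qquad
    \mathrm{NRMSE} = \frac{\mathrm{RMSE}}{y_{\max} - y_{\min}}

where y_i are the observed values (e.g., stock prices) and \hat{y}_i the predicted ones. The closing observation, that latency is reduced when the window size equals the data arrival rate, can be pictured with a minimal tumbling-window sketch; the plain-Python function below (tumbling_average, and the sample events fed to it) is an invented illustration under that assumption, not the authors' implementation:

# Minimal sketch, not the authors' system: tumbling-window aggregation where the
# window length matches the arrival interval, so each record is flushed at the
# very next window boundary (roughly one interval of buffering latency).
def tumbling_average(records, window_s=1.0):
    """records: iterable of (timestamp_seconds, value) in arrival order."""
    window, window_end = [], None
    for ts, value in records:
        if window_end is None:
            window_end = ts + window_s
        while ts >= window_end:          # close every window that has elapsed
            if window:
                yield window_end, sum(window) / len(window)
                window = []
            window_end += window_s
        window.append(value)
    if window:                           # flush the final partial window
        yield window_end, sum(window) / len(window)

# One record per second with window_s equal to that arrival interval:
events = [(0.0, 10.0), (1.0, 12.0), (2.0, 11.0), (3.0, 13.0)]
print(list(tumbling_average(events, window_s=1.0)))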
Pages: 1-19
Number of pages: 19