SAND: A fault-tolerant streaming architecture for network traffic analytics

Citations: 0
Authors
Liu, Qin [1]
Lui, John C. S. [1]
He, Cheng [2]
Pan, Lujia [2]
Fan, Wei [2]
Shi, Yunlong [2]
Affiliations
[1] Chinese University of Hong Kong, Hong Kong, People's Republic of China
[2] Huawei Noah's Ark Lab, Hong Kong, People's Republic of China
Keywords
Stream processing; Network analytics; Fault-tolerance
DOI
10.1016/j.jss.2015.07.049
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Many long-running network analytics applications (e.g., flow size estimation and heavy traffic detection) impose high-throughput and high-reliability requirements on stream processing systems. However, previous stream processing systems, which are designed for higher-layer applications, cannot sustain high-speed traffic at the core router level. Furthermore, due to the nondeterministic nature of message passing among workers, the fault-tolerant schemes of previous streaming architectures based on the continuous operator model cannot provide strong consistency, which is essential for network analytics. In this paper, we present the design and implementation of SAND, a fault-tolerant distributed stream processing system for network analytics. SAND is designed to operate under high-speed network traffic, and it uses a novel checkpointing protocol that performs failure recovery based on upstream backup and checkpointing. We prove that our fault-tolerant scheme provides strong consistency even under multiple node failures. We implement several real-world network analytics applications on SAND, including heavy traffic hitter detection as well as policy and charging control for cellular networks, and we evaluate their performance using network traffic captured from commercial cellular core networks. We demonstrate that SAND can sustain high-speed network traffic and that our fault-tolerant scheme is efficient. (C) 2015 Elsevier Inc. All rights reserved.
Pages: 553-563
Number of pages: 11