Regular Programming for Quantitative Properties of Data Streams

被引:37
作者
Alur, Rajeev [1 ]
Fisman, Dana [1 ]
Raghothaman, Mukund [1 ]
机构
[1] Univ Penn, Philadelphia, PA 19104 USA
来源
PROGRAMMING LANGUAGES AND SYSTEMS (ESOP 2016) | 2016年 / 9632卷
关键词
CONTINUOUS QUERIES; LANGUAGE;
D O I
10.1007/978-3-662-49498-1_2
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
We propose quantitative regular expressions (QREs) as a high-level programming abstraction for specifying complex numerical queries over data streams in a modular way. Our language allows the arbitrary nesting of orthogonal sets of combinators: (a) generalized versions of choice, concatenation, and Kleene-iteration from regular expressions, (b) streaming (serial) composition, and (c) numerical operators such as min, max, sum, difference, and averaging. Instead of requiring the programmer to figure out the low-level details of what state needs to be maintained and how to update it while processing each data item, the regular constructs facilitate a global view of the entire data stream splitting it into different cases and multiple chunks. The key technical challenge in defining our language is the design of typing rules that can be enforced efficiently and which strike a balance between expressiveness and theoretical guarantees for well-typed programs. We describe how to compile each QRE into an efficient streaming algorithm. The time and space complexity is dependent on the complexity of the data structure for representing terms over the basic numerical operators. In particular, we show that when the set of numerical operations is sum, difference, minimum, maximum, and average, the compiled algorithm uses constant space and processes each symbol in the data stream in constant time out-putting the cost of the stream processed so far. Finally, we prove that the expressiveness of QREs coincides with the streaming composition of regular functions, that is, MSO-definable string-to-term transformations, leading to a potentially robust foundation for understanding their expressiveness and the complexity of analysis problems.
引用
收藏
页码:15 / 40
页数:26
相关论文
共 37 条
[1]   Aurora: a new model and architecture for data stream management [J].
Abadi, DJ ;
Carney, D ;
Cetintemel, U ;
Cherniack, M ;
Convey, C ;
Lee, S ;
Stonebraker, M ;
Tatbul, N ;
Zdonik, S .
VLDB JOURNAL, 2003, 12 (02) :120-139
[2]  
Alon N., 1996, Proceedings of the Twenty-Eighth Annual ACM Symposium on the Theory of Computing, P20, DOI 10.1145/237814.237823
[3]  
Alur R., 2014, 29 ANN S LOG COMP SC
[4]  
Alur R, 2015, ACM SIGPLAN NOTICES, V50, P125, DOI [10.1145/2676726.2676981, 10.1145/2775051.2676981]
[5]   Regular Functions and Cost Register Automata [J].
Alur, Rajeev ;
D'Antoni, Loris ;
Deshmukh, Jyotirmoy ;
Raghothaman, Mukund ;
Yuan, Yifei .
2013 28TH ANNUAL IEEE/ACM SYMPOSIUM ON LOGIC IN COMPUTER SCIENCE (LICS), 2013, :13-22
[6]  
Alur R, 2012, LECT NOTES COMPUT SC, V7392, P42, DOI 10.1007/978-3-642-31585-5_8
[7]   Streaming Transducers for Algorithmic Verification of Single-pass List-processing Programs [J].
Alur, Rajeev ;
Cerny, Pavol .
POPL 11: PROCEEDINGS OF THE 38TH ANNUAL ACM SIGPLAN-SIGACT SYMPOSIUM ON PRINCIPLES OF PROGRAMMING LANGUAGES, 2011, :599-610
[8]  
[Anonymous], 2013, Introduction to the Theory of Computation
[9]  
Arasu A, 2004, LECT NOTES COMPUT SC, V2921, P1
[10]  
Babu S, 2001, SIGMOD REC, V30, P109, DOI 10.1145/603867.603884