Rapid Sampling for Visualizations with Ordering Guarantees

Cited by: 51
Authors
Kim, Albert [1]
Blais, Eric [1, 2]
Parameswaran, Aditya [1, 3]
Indyk, Piotr [1]
Madden, Sam [1]
Rubinfeld, Ronitt [1, 4]
Affiliations
[1] MIT, Cambridge, MA 02139 USA
[2] Univ Waterloo, Waterloo, ON N2L 3G1, Canada
[3] Univ Illinois (UIUC), Urbana, IL USA
[4] Tel Aviv Univ, IL-69978 Tel Aviv, Israel
Source
PROCEEDINGS OF THE VLDB ENDOWMENT | 2015 / Vol. 8 / No. 5
Funding
National Science Foundation (USA)
DOI
10.14778/2735479.2735485
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Visualizations are frequently used as a means to understand trends and gather insights from datasets, but often take a long time to generate. In this paper, we focus on the problem of rapidly generating approximate visualizations while preserving crucial visual properties of interest to analysts. Our primary focus will be on sampling algorithms that preserve the visual property of ordering; our techniques will also apply to some other visual properties. For instance, our algorithms can be used to generate an approximate visualization of a bar chart very rapidly, where the comparisons between any two bars are correct. We formally show that our sampling algorithms are generally applicable and provably optimal in theory, in that they do not take more samples than necessary to generate the visualizations with ordering guarantees. They also work well in practice, correctly ordering output groups while taking orders of magnitude fewer samples and much less time than conventional sampling schemes.
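The abstract's idea of sampling only until every pair of bars is correctly ordered can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes round-based sampling with replacement, Hoeffding confidence intervals with a union bound over groups and rounds, and values bounded in [0, value_range]. The function name `sample_until_ordered`, its parameters, and the toy data are all hypothetical.

```python
import math
import random

def sample_until_ordered(groups, delta=0.05, value_range=1.0,
                         max_rounds=100_000, seed=0):
    """Sample each group until its confidence interval overlaps no other.

    groups: dict mapping a group name to a list of values in [0, value_range].
    Returns (estimated means, samples drawn per group).
    Illustrative assumption: Hoeffding intervals, sampling with replacement.
    """
    rng = random.Random(seed)
    k = len(groups)
    sums = {g: 0.0 for g in groups}
    counts = {g: 0 for g in groups}
    lo, hi = {}, {}
    active = set(groups)
    for n in range(1, max_rounds + 1):
        # Half-width chosen so that all intervals hold simultaneously
        # with probability at least 1 - delta (union bound).
        eps = value_range * math.sqrt(
            math.log(2 * k * n * (n + 1) / delta) / (2 * n))
        for g in sorted(active):
            sums[g] += rng.choice(groups[g])
            counts[g] += 1
            mean = sums[g] / counts[g]
            lo[g], hi[g] = mean - eps, mean + eps
        # A group stays active only while its interval overlaps another's;
        # groups that stopped earlier keep their last (frozen) interval.
        active = {
            g for g in active
            if any(h != g and lo[g] <= hi[h] and lo[h] <= hi[g]
                   for h in groups)
        }
        if not active:
            break  # every pair of groups is now separated
    return {g: sums[g] / counts[g] for g in groups}, counts

# Toy data (hypothetical): three bars with true means 0.2, 0.5, 0.8.
groups = {
    "a": [0.1, 0.2, 0.3] * 1000,
    "b": [0.4, 0.5, 0.6] * 1000,
    "c": [0.7, 0.8, 0.9] * 1000,
}
est, counts = sample_until_ordered(groups)
print(sorted(est, key=est.get), counts)
```

In this sketch, groups whose means are far apart stop being sampled early, so the sampling effort concentrates on the bars that are hard to tell apart. Groups with equal means never separate, which is why the loop is capped by `max_rounds`.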
Pages: 521-532
Page count: 12