An analysis of query-agnostic sampling for interactive data exploration

Cited by: 3
Authors
Liu, Wenzhao [1]
Diao, Yanlei [1]
Liu, Anna [1]
Affiliation
[1] Univ Massachusetts, Coll Informat & Comp Sci, Amherst, MA 01003 USA
Funding
National Science Foundation (USA);
Keywords
Databases; Interactive data exploration; Query-agnostic sampling;
DOI
10.1080/03610926.2017.1363231
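Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes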
020208 ; 070103 ; 0714 ;
Abstract
Data analysts often explore a large database to identify the data of interest, but may not be able to specify the exact query to send to the database. A manual data exploration process is labor intensive and time-consuming. In the new paradigm of system-aided interactive data exploration, the Database Management System presents the samples to the user and engages the user in an interactive exploration process to identify the user interest. In this article, we examine a number of initial sampling techniques to identify at least one positive (i.e., interesting) sample and compare them both theoretically and empirically.
Pages: 3820-3837
Page count: 18