Query-Based Data Pricing

Cited by: 123
Authors
Koutris, Paraschos [1]
Upadhyaya, Prasang [1]
Balazinska, Magdalena [1]
Howe, Bill [1]
Suciu, Dan [1]
Affiliations
[1] Univ Washington, Seattle, WA 98195 USA
Funding
U.S. National Science Foundation;
Keywords
Algorithms; Economics; Theory; Data pricing; arbitrage; query determinacy;
DOI
10.1145/2770870
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Data is increasingly being bought and sold online, and Web-based marketplace services have emerged to facilitate these activities. However, current mechanisms for pricing data are very simple: buyers can choose only from a set of explicit views, each with a specific price. In this article, we propose a framework for pricing data on the Internet that, given the price of a few views, allows the price of any query to be derived automatically. We call this capability query-based pricing. We first identify two important properties that the pricing function must satisfy, the arbitrage-free and discount-free properties. Then, we prove that there exists a unique function that satisfies these properties and extends the seller's explicit prices to all queries. Central to our framework is the notion of query determinacy, and in particular instance-based determinacy: we present several results on its complexity and properties. When both the views and the query are unions of conjunctive queries or conjunctive queries, we show that the complexity of computing the price is high. To ensure tractability, we restrict the explicit prices to be defined only on selection views (which is the common practice today). We give algorithms with polynomial time data complexity for computing the price of two classes of queries: chain queries (by reducing the problem to network flow), and cyclic queries. Furthermore, we completely characterize the class of conjunctive queries without self-joins that have PTIME data complexity, and prove that pricing all other queries is NP-complete, thus establishing a dichotomy on the complexity of the pricing problem when all views are selection queries.
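As a rough illustration of the idea in the abstract, the sketch below prices a query as the cheapest bundle of explicitly priced views that determines it. This is a brute-force toy, not the article's algorithm: the function name `arbitrage_price`, the view labels, the prices, and the hard-coded determinacy test `determines_full_relation` are all assumptions made for the example. The toy test assumes a single relation R(X, Y) with known column domains {a1, a2} and {b1, b2}, where the full relation is recoverable once every selection on X, or every selection on Y, has been bought.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, Optional, Tuple


def arbitrage_price(
    view_prices: Dict[str, float],
    determines: Callable[[FrozenSet[str]], bool],
) -> Tuple[Optional[float], Optional[FrozenSet[str]]]:
    """Return the cheapest bundle of priced views that determines the query.

    `determines` is a hypothetical oracle supplied by the caller; real
    determinacy checking is the hard part and is not implemented here.
    The search is exponential in the number of views.
    """
    best_cost, best_bundle = None, None
    names = sorted(view_prices)
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            bundle = frozenset(combo)
            if not determines(bundle):
                continue
            cost = sum(view_prices[v] for v in combo)
            if best_cost is None or cost < best_cost:
                best_cost, best_bundle = cost, bundle
    return best_cost, best_bundle


# Hypothetical seller prices for selection views on R(X, Y).
prices = {"X=a1": 3, "X=a2": 4, "Y=b1": 2, "Y=b2": 6}


def determines_full_relation(bundle: FrozenSet[str]) -> bool:
    # Toy assumption: R is determined by all X-selections or all Y-selections.
    return {"X=a1", "X=a2"} <= bundle or {"Y=b1", "Y=b2"} <= bundle


price, bundle = arbitrage_price(prices, determines_full_relation)
print(price)  # 7
```

Here buying both X-selections costs 3 + 4 = 7, which undercuts the Y-selections at 2 + 6 = 8, so the derived price of the full relation is 7. Because the derived price is a minimum over all determining bundles, no buyer can obtain the same answer more cheaply by combining other views, which is the intuition behind the arbitrage-free property.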
Pages: 44