The Subset Assignment Problem for Data Placement in Caches

Cited by: 5
Authors
Ghandeharizadeh, Shahram [1]
Irani, Sandy [2]
Lam, Jenny [3]
Affiliations
[1] Univ Southern Calif, Dept Comp Sci, Los Angeles, CA 90089 USA
[2] Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92697 USA
[3] San Jose State Univ, Dept Comp Sci, San Jose, CA 95192 USA
Keywords
Memory management; Caching; Simplex method; Linear programming; Minimum cost flow; Algorithms
DOI
10.1007/s00453-017-0403-4
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
We introduce the subset assignment problem in which items of varying sizes are placed in a set of bins with limited capacity. Items can be replicated and placed in any subset of the bins. Each (item, subset) pair has an associated cost. Not assigning an item to any of the bins is not free in general and can potentially be the most expensive option. The goal is to minimize the total cost of assigning items to subsets without exceeding the bin capacities. The subset assignment problem models the problem of managing a cache composed of banks of memory with varying cost/performance specifications. The ability to replicate a data item in more than one memory bank can benefit the overall performance of the system with a faster recovery time in the event of a memory failure. For this setting, the number n of data objects (items) is very large and the number d of memory banks (bins) is a small constant (on the order of 3 or 4). Therefore, the goal is to determine an optimal assignment in time that minimizes dependence on n. The integral version of this problem is NP-hard since it is a generalization of the knapsack problem. We focus on an efficient solution to the LP relaxation as the number of fractionally assigned items will be at most d. If the data objects are small with respect to the size of the memory banks, the effect of excluding the fractionally assigned data items from the cache will be small. We give an algorithm that solves the LP relaxation and runs in time $O\left(\binom{3^d}{d+1}\,\mathrm{poly}(d)\, n \log(n) \log(nC) \log(Z)\right)$, where Z is the maximum item size and C the maximum storage cost.
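The LP relaxation described in the abstract can be written out as a sketch. The notation here is an assumption for illustration and is not taken from the paper: $x_{i,S}$ is the fraction of item $i$ assigned to subset $S$ of the bins, $c_{i,S}$ is the cost of that (item, subset) pair, $s_i$ is the size of item $i$, $B_j$ is the capacity of bin $j$, and $[d] = \{1, \dots, d\}$.

```latex
% Sketch of the LP relaxation; all symbol names are assumed, not the authors'.
% S ranges over all subsets of [d], including the empty set, whose cost
% c_{i,\emptyset} models leaving item i out of every bin (not free in general).
\begin{align*}
\min_{x}\quad & \sum_{i=1}^{n} \sum_{S \subseteq [d]} c_{i,S}\, x_{i,S} \\
\text{s.t.}\quad & \sum_{S \subseteq [d]} x_{i,S} = 1 && i = 1, \dots, n \\
& \sum_{i=1}^{n} s_i \sum_{S \ni j} x_{i,S} \le B_j && j = 1, \dots, d \\
& x_{i,S} \ge 0 && i = 1, \dots, n,\ S \subseteq [d]
\end{align*}
% The integral version replaces x_{i,S} \ge 0 with x_{i,S} \in \{0, 1\}.
```

Under this formulation, one standard way to see the bound of at most d fractionally assigned items is a counting argument: apart from non-negativity there are n + d constraints, so a basic optimal solution has at most n + d nonzero variables. Every item needs at least one nonzero variable, which leaves at most d items that can have more than one.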
Pages: 2201-2220
Page count: 20