How Much Data Is Sufficient to Learn High-Performing Algorithms? Generalization Guarantees for Data-Driven Algorithm Design

Cited by: 19
Authors
Balcan, Maria-Florina [1]
DeBlasio, Dan [2]
Dick, Travis [3]
Kingsford, Carl [1,4]
Sandholm, Tuomas [1,5,6,7]
Vitercik, Ellen [1]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
[2] Univ Texas El Paso, El Paso, TX 79968 USA
[3] Univ Penn, Philadelphia, PA 19104 USA
[4] Ocean Genom Inc, Pittsburgh, PA USA
[5] Strateg Machine Inc, Pittsburgh, PA USA
[6] Strategy Robot Inc, Pittsburgh, PA USA
[7] Optimized Markets Inc, Pittsburgh, PA USA
Source
STOC '21: PROCEEDINGS OF THE 53RD ANNUAL ACM SIGACT SYMPOSIUM ON THEORY OF COMPUTING | 2021
Funding
Andrew W. Mellon Foundation; US National Science Foundation; US National Institutes of Health
Keywords
Automated algorithm design; data-driven algorithm design; automated algorithm configuration; machine learning theory; computational biology; mechanism design;
DOI
10.1145/3406325.3451036
Chinese Library Classification (CLC)
TP301 [Theory and Methods];
Subject Classification Code
081202;
Abstract
Algorithms often have tunable parameters that impact performance metrics such as runtime and solution quality. For many algorithms used in practice, no parameter settings admit meaningful worst-case bounds, so the parameters are made available for the user to tune. Alternatively, parameters may be tuned implicitly within the proof of a worst-case guarantee. Worst-case instances, however, may be rare or nonexistent in practice. A growing body of research has demonstrated that data-driven algorithm design can lead to significant improvements in performance. This approach uses a training set of problem instances sampled from an unknown, application-specific distribution and returns a parameter setting with strong average performance on the training set. We provide a broadly applicable theory for deriving generalization guarantees that bound the difference between the algorithm's average performance over the training set and its expected performance on the unknown distribution. Our results apply no matter how the parameters are tuned, be it via an automated or manual approach. The challenge is that for many types of algorithms, performance is a volatile function of the parameters: slightly perturbing the parameters can cause a large change in behavior. Prior research (e.g., Gupta and Roughgarden, SICOMP'17; Balcan et al., COLT'17, ICML'18, EC'18) has proved generalization bounds by employing case-by-case analyses of greedy algorithms, clustering algorithms, integer programming algorithms, and selling mechanisms. We uncover a unifying structure which we use to prove extremely general guarantees, yet we recover the bounds from prior research. Our guarantees, which are tight up to logarithmic factors in the worst case, apply whenever an algorithm's performance is a piecewise-constant, piecewise-linear, or, more generally, piecewise-structured function of its parameters. Our theory also implies novel bounds for voting mechanisms and dynamic programming algorithms from computational biology.
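To make the kind of guarantee described above concrete, the following is a minimal sketch in uniform-convergence form; the notation (the performance measure u, parameter setting \rho, distribution \mathcal{D}, sample size N, and error term \epsilon(N, \delta)) is illustrative and not taken from the paper itself.

% Hedged sketch, assuming u(x, \rho) denotes the (bounded) performance of the algorithm
% with parameter setting \rho on instance x, and x_1, ..., x_N are training instances
% drawn i.i.d. from an unknown, application-specific distribution \mathcal{D}.
% A generalization guarantee of the kind the abstract describes states that, with
% probability at least 1 - \delta over the draw of the training set, empirical and
% expected performance are close for every parameter setting simultaneously
% (hence the result holds no matter how the parameters are tuned):
\[
  \sup_{\rho} \left| \frac{1}{N} \sum_{i=1}^{N} u(x_i, \rho)
    \;-\; \mathbb{E}_{x \sim \mathcal{D}}\bigl[ u(x, \rho) \bigr] \right|
  \;\le\; \epsilon(N, \delta),
\]
% where \epsilon(N, \delta) shrinks as N grows and, in the paper's setting, its rate is
% governed by the piecewise structure (constant, linear, or otherwise structured) of
% u(x, \cdot) as a function of \rho.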
Pages: 919-932
Number of pages: 14
References
77 in total
[1] Alabi, Daniel. Conference on Learning Theory (COLT), 2019.
[2] [Anonymous]. STOC, 1984.
[3] [Anonymous]. P AGGREGATION REVELA, 1979.
[4] Assouad, P. Density and Dimension. Annales de l'Institut Fourier, 1983, 33(3): 233-282.
[5] Balcan, M. Proceedings of the 35th International Conference on Machine Learning (ICML), 2018, Vol. 80.
[6] Balcan, M. F. Proceedings of Machine Learning Research, 2020.
[7] Balcan, Maria-Florina; Sandholm, Tuomas; Vitercik, Ellen. A General Theory of Sample Complexity for Multi-Item Profit Maximization. ACM EC'18: Proceedings of the 2018 ACM Conference on Economics and Computation, 2018: 173-174.
[8] Balcan, Maria-Florina; Dick, Travis; Vitercik, Ellen. Dispersion for Data-Driven Algorithm Design, Online Learning, and Private Optimization. 2018 IEEE 59th Annual Symposium on Foundations of Computer Science (FOCS), 2018: 603-614.
[9] Balcan, Maria-Florina. Proceedings of the International Conference on Learning Representations (ICLR), 2020.
[10] Balcan, Maria-Florina. Proceedings of the Conference on Learning Theory (COLT), 2017.