Sequential Interdiction with Incomplete Information and Learning

被引：28

作者：

Borrero, Juan S. ^{[1
]}

Prokopyev, Oleg A. ^{[2
]}

Saure, Denis ^{[3
]}

机构：

[1] Oklahoma State Univ, Sch Ind Engn & Management, Stillwater, OK 74078 USA

[2] Univ Pittsburgh, Dept Ind Engn, Pittsburgh, PA 15261 USA

[3] Univ Chile, Dept Ind Engn, Santiago 8370456, Chile

来源：

OPERATIONS RESEARCH | 2019年 / 67卷 / 01期

基金：

美国国家科学基金会;

关键词：

bilevel programming; attacker-defender; interdiction; learning; incomplete information; online optimization; robust optimization; BILEVEL; ALGORITHM;

D O I：

10.1287/opre.2018.1773

中图分类号：

C93 [管理学];

学科分类号：

12 ; 1201 ; 1202 ; 120202 ;

摘要：

We present a framework for a class of sequential decision-making problems in the context of general interdiction problems, in which a leader and a follower repeatedly interact. At each period, the leader allocates resources to disrupt the performance of the follower (e.g., as in defender-attacker or network interdiction problems), who, in turn, minimizes some cost function over a set of activities that depends on the leader's decision. Although the follower has complete knowledge of the follower's problem, the leader has only partial information and needs to learn about the cost parameters, available resources, and the follower's activities from the feedback generated by the follower's actions. We measure policy performance in terms of its time-stability, defined as the number of periods it takes for the leader to match the actions of an oracle with complete information. In particular, we propose a class of greedy and robust policies and show that these policies are weakly optimal, eventually match the oracle's actions, and provide a real-time certificate of optimality. We also study a lower bound on any policy performance based on the notion of a semioracle. Our numerical experiments demonstrate that the proposed policies consistently outperform a reasonable benchmark and perform fairly close to the semioracle.

引用

页码：72 / 89

页数：18

共 31 条

[1] A Dynamic Near-Optimal Algorithm for Online Linear Programming [J].

Agrawal, Shipra ;

Wang, Zizhuo ;

Ye, Yinyu .

OPERATIONS RESEARCH, 2014, 62 (04) :876-890

[2]

[Anonymous], 2014, Integer and Combinatorial Optimization

[3]

[Anonymous], 2011, WILEY ENCY OPERATION

[4]

[Anonymous], 2009, P C LEARN THEOR COLT

[5] Links between linear bilevel and mixed 0-1 programming problems [J].

Audet, C ;

Hansen, P ;

Jaumard, B ;

Savard, G .

JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 1997, 93 (02) :273-300

[6] Regret in Online Combinatorial Optimization [J].

Audibert, Jean-Yves ;

Bubeck, Sebastien ;

Lugosi, Gabor .

MATHEMATICS OF OPERATIONS RESEARCH, 2014, 39 (01) :31-45

[7]

Auer P, 2003, SIAM J COMPUT, V32, P48, DOI 10.1137/S0097539701398375

[8] A bilevel programming approach to determining tax credits for biofuel production [J].

Bard, JF ;

Plummer, J ;

Sourie, JC .

EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2000, 120 (01) :30-46

[9]

BenTal A, 2009, PRINC SER APPL MATH, P1

[10]

Bertsimas D., 1997, INTRO LINEAR OPTIMIZ, V6

← 1 2 3 4 →