From Predictive to Prescriptive Analytics

Cited by: 316
Authors
Bertsimas, Dimitris [1]
Kallus, Nathan [2, 3]
Affiliations
[1] MIT, Ctr Operat Res, Cambridge, MA 02139 USA
[2] Cornell Univ, Cornell Tech, New York, NY 10044 USA
[3] Cornell Univ, Sch Operat Res & Informat Engn, New York, NY 10044 USA
Funding
National Science Foundation (USA);
Keywords
data-driven decision making; machine learning; stochastic optimization; REGRESSION; CONVERGENCE; SEARCH;
DOI
10.1287/mnsc.2018.3253
Chinese Library Classification (CLC) number
C93 [Management Science];
Subject classification codes
12 ; 1201 ; 1202 ; 120202 ;
Abstract
We combine ideas from machine learning (ML) and operations research and management science (OR/MS) in developing a framework, along with specific methods, for using data to prescribe optimal decisions in OR/MS problems. In a departure from other work on data-driven optimization, we consider data consisting, not only of observations of quantities with direct effect on costs/revenues, such as demand or returns, but also predominantly of observations of associated auxiliary quantities. The main problem of interest is a conditional stochastic optimization problem, given imperfect observations, where the joint probability distributions that specify the problem are unknown. We demonstrate how our proposed methods are generally applicable to a wide range of decision problems and prove that they are computationally tractable and asymptotically optimal under mild conditions, even when data are not independent and identically distributed and for censored observations. We extend these to the case in which some decision variables, such as price, may affect uncertainty and their causal effects are unknown. We develop the coefficient of prescriptiveness P to measure the prescriptive content of data and the efficacy of a policy from an operations perspective. We demonstrate our approach in an inventory management problem faced by the distribution arm of a large media company, shipping 1 billion units yearly. We leverage both internal data and public data harvested from IMDb, Rotten Tomatoes, and Google to prescribe operational decisions that outperform baseline measures. Specifically, the data we collect, leveraged by our methods, account for an 88% improvement as measured by our coefficient of prescriptiveness.
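As a rough illustration of the conditional stochastic optimization idea in the abstract, the sketch below prescribes an order quantity from auxiliary observations by minimizing a locally weighted sample cost. It is a minimal sketch under stated assumptions, not the authors' implementation: the k-nearest-neighbor weights, the newsvendor cost with hypothetical `back` and `hold` parameters, and all function names are illustrative choices. The last function assumes the coefficient of prescriptiveness P is computed as one minus the ratio of two cost gaps, each measured against a full-information oracle, with a covariate-free sample-average policy as the baseline.

```python
import math

def knn_weights(x_train, x_query, k):
    """Weight 1/k on the k training points nearest to x_query, 0 elsewhere."""
    dists = [math.dist(x, x_query) for x in x_train]
    nearest = sorted(range(len(x_train)), key=lambda i: dists[i])[:k]
    weights = [0.0] * len(x_train)
    for i in nearest:
        weights[i] = 1.0 / k
    return weights

def newsvendor_cost(z, y, back=4.0, hold=1.0):
    """Hypothetical cost: backorder penalty if demand y exceeds order z, else holding."""
    return back * max(y - z, 0.0) + hold * max(z - y, 0.0)

def prescribe(weights, y_train, cost=newsvendor_cost):
    """Minimize the weighted sample cost over candidate decisions.

    For the piecewise-linear newsvendor cost a minimizer lies at an observed
    demand with positive weight, so a search over those values suffices.
    """
    candidates = [y for w, y in zip(weights, y_train) if w > 0]
    return min(
        candidates,
        key=lambda z: sum(w * cost(z, y) for w, y in zip(weights, y_train)),
    )

def coefficient_of_prescriptiveness(policy_costs, baseline_costs, oracle_costs):
    """Assumed form: P = 1 - (policy - oracle) / (baseline - oracle), using mean costs."""
    mean = lambda values: sum(values) / len(values)
    gap_policy = mean(policy_costs) - mean(oracle_costs)
    gap_baseline = mean(baseline_costs) - mean(oracle_costs)
    return 1.0 - gap_policy / gap_baseline

# Toy data: one auxiliary feature, two demand regimes.
x_train = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
y_train = [10.0, 11.0, 12.0, 50.0, 51.0, 52.0]

local_order = prescribe(knn_weights(x_train, (0.05,), k=3), y_train)
baseline_order = prescribe([1.0 / len(y_train)] * len(y_train), y_train)
```

On this toy data the locally weighted decision follows the low-demand regime near the query point, while the covariate-free baseline hedges across both regimes. A value of P near 1 would mean the policy recovers most of the gap between that baseline and the full-information oracle.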
Pages: 1025-1044
Number of pages: 20