THE STRONG LAW OF LARGE NUMBERS FOR SEQUENTIAL DECISIONS UNDER UNCERTAINTY

被引：47

作者：

ALGOET, PH ^{[1
]}

机构：

[1] IBM CORP,ALMADEN RES CTR,SAN JOSE,CA 95114

来源：

IEEE TRANSACTIONS ON INFORMATION THEORY | 1994年 / 40卷 / 03期

基金：

美国国家科学基金会;

关键词：

SEQUENTIAL DECISIONS; ASYMPTOTIC OPTIMALITY; ERGODIC THEORY; LEARNING FROM EXPERIENCE; UNIVERSAL DECISION SCHEMES; PATTERN CLASSIFICATION;

D O I：

10.1109/18.335876

中图分类号：

TP [自动化技术、计算机技术];

学科分类号：

0812 ;

摘要：

We combine optimization and ergodic theory to characterize the optimum long-run average performance that can be asymptotically attained by nonanticipating sequential decisions. Let {X(t)} be a stationary ergodic process, and suppose an action b(t) must be selected in a space B with knowledge of the t-past (X0,...,X(t-1)) at the beginning of every period t greater-than-or-equal-to 0. Action b(t) will incur a loss l(b(t), X(t)) at the end of period t when the random variable X(t) is revealed. We prove under mild integrability conditions that the optimum strategy is to select actions that minimize the conditional expected loss given the currently available information at each step. The minimum long-run average loss per decision can be approached arbitrarily closely by strategies that are finite-order Markov, and under certain continuity conditions, it is equal to the minimum expected loss given the infinite past. If the loss l(b, x) is bounded and continuous and if the space B is compact, then the minimum can be asymptotically attained, even if the distribution of the process {X(t)} is unknown a priori and must be learned from experience.

引用

页码：609 / 633

页数：25

共 108 条

[1] UNIVERSAL SCHEMES FOR PREDICTION, GAMBLING AND PORTFOLIO SELECTION [J].