Shaping robot behavior using principles from instrumental conditioning

Cited by: 39

Authors:

Saksida, LM [1]

Raymond, SM

Touretzky, DS

Affiliations:

[1] Carnegie Mellon Univ, Inst Robot, Pittsburgh, PA 15213 USA

[2] Carnegie Mellon Univ, Dept Comp Sci, Pittsburgh, PA 15213 USA

[3] Carnegie Mellon Univ, Ctr Neural Basis Cognit, Pittsburgh, PA 15213 USA

Source:

ROBOTICS AND AUTONOMOUS SYSTEMS | 1997 / Vol. 22 / Issue 3-4

Funding:

National Science Foundation (USA);

Keywords:

autonomous mobile robots; instrumental learning; operant conditioning; reinforcement learning; shaping;

DOI:

10.1016/S0921-8890(97)00041-9

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Shaping by successive approximations is an important animal training technique in which behavior is gradually adjusted in response to strategically timed reinforcements. We describe a computational model of this shaping process and its implementation on a mobile robot. Innate behaviors in our model are sequences of actions and enabling conditions, and shaping is a behavior editing process realized by multiple editing mechanisms. The model replicates some fundamental phenomena associated with instrumental learning in animals, and allows an RWI B21 robot to learn several distinct tasks derived from the same innate behavior. Copyright (C) 1997 Elsevier Science B.V.

Pages: 231-249

Page count: 19
