About Profit Sharing Considering Infatuate Actions

Cited by: 1
Authors
Uemura, Wataru [1]
Affiliation
[1] Ryukoku Univ, 1-5 Seta, Otsu, Shiga, Japan
Keywords
reinforcement learning; profit sharing; MDP; POMDP
DOI
10.20965/jaciii.2009.p0615
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In reinforcement learning systems based on trial and error, an agent that perceives its environment and takes actions to maximize its chances of success receives a reward when it attains the goal of the learning task. In Profit Sharing, learning proceeds by accumulating such rewards. To keep accumulating them, the agent keeps repeating the particular actions it has already learned and avoids selecting other actions, which makes it less adaptable to changes in the environment. This paper therefore proposes introducing the concept of infatuation to overcome the agent's reluctance to adapt to new environments. For a living agent, when a single reinforcement learning process is repeated, the stimulus perceived on each repetition gradually loses its intensity through familiarization. If, however, the agent encounters a set of rules different from those of the repeated process and then reverts to the previous process, the stimulus it receives after the reversion recovers its intensity. We apply this concept of infatuation to Profit Sharing and confirm its effects through experiments.
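The abstract describes Profit Sharing only at a high level. As a rough illustration, the Python sketch below shows the classic Profit Sharing update (a reward distributed backward along the episode with geometrically decaying credit) together with a hypothetical "familiarization" factor that weakens the perceived reward when the same episode is repeated and restores it on a novel one. Every name, parameter value, and the decay schedule here is an assumption made for illustration, not the paper's actual formulation.

```python
import random
from collections import defaultdict

GAMMA = 0.8      # geometric credit-assignment rate (assumed value)
EPSILON = 0.1    # exploration rate for action selection (assumed value)
DECAY = 0.9      # hypothetical familiarization decay per repetition

weights = defaultdict(float)   # rule weights w(state, action)
last_episode = None            # previous rewarded episode, for comparison
intensity = 1.0                # hypothetical stimulus-intensity factor

def select_action(state, actions):
    """Epsilon-greedy selection over rule weights; a common choice,
    not necessarily the selection scheme used in the paper."""
    if random.random() < EPSILON:
        return random.choice(actions)
    return max(actions, key=lambda a: weights[(state, a)])

def reinforce(episode, reward):
    """Classic Profit Sharing: distribute the reward backward along the
    episode of (state, action) pairs with geometrically decaying credit,
    modulated here by the hypothetical familiarization factor."""
    global last_episode, intensity
    if episode == last_episode:
        intensity *= DECAY      # repeated episode: stimulus weakens
    else:
        intensity = 1.0         # novel episode: stimulus recovers
    last_episode = list(episode)
    credit = reward * intensity
    for state, action in reversed(episode):
        weights[(state, action)] += credit
        credit *= GAMMA
```

A geometric credit-assignment function is a common choice in the Profit Sharing literature; the familiarization factor is just one simple way the stimulus decay and recovery described in the abstract could be realized.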
Pages: 615-623
Page count: 9