Path-finding in real and simulated rats: assessing the influence of path characteristics on navigation learning

Cited by: 8
Authors
Tamosiunaite, Minija [2 ,3 ]
Ainge, James [3 ]
Kulvicius, Tomas [1 ,2 ]
Porr, Bernd [4 ]
Dudchenko, Paul [3 ]
Woergoetter, Florentin [1 ,3 ]
Affiliations
[1] Univ Gottingen, Bernstein Ctr Computat Neurosci, Gottingen, Germany
[2] Vytautas Magnus Univ, Dept Informat, LT-44404 Kaunas, Lithuania
[3] Univ Stirling, Dept Psychol, Stirling FK9 4LA, Scotland
[4] Univ Glasgow, Dept Elect & Elect Engn, Glasgow G12 8LT, Lanark, Scotland
Funding
Biotechnology and Biological Sciences Research Council (BBSRC), UK;
Keywords
reinforcement learning; SARSA; place field system; function approximation; weight decay;
DOI
10.1007/s10827-008-0094-6
Chinese Library Classification
Q [Biological Sciences];
Discipline Code
07; 0710; 09;
Abstract
A large body of experimental evidence suggests that the hippocampal place field system is involved in reward-based navigation learning in rodents. Reinforcement learning (RL) mechanisms have been used to model this, associating the state space of an RL algorithm with the place-field map of a rat. The convergence properties of RL algorithms are affected by the exploration patterns of the learner. Therefore, we first analyzed the path characteristics of freely exploring rats in a test arena. We found that straight path segments with a mean length of 23 cm, up to a maximal length of 80 cm, make up a significant proportion of the total paths. Thus, rat paths are biased compared to random exploration. Next, we designed an RL system that reproduces these specific path characteristics. Our model arena is covered by overlapping, probabilistically firing place fields (PFs) of realistic size and coverage. Because the convergence of RL algorithms is also influenced by the characteristics of the state space, different PF sizes and densities, leading to different degrees of overlap, were also investigated. The model rat learns to find a reward located opposite its starting point. We observed that the combination of biased straight exploration, overlapping coverage and probabilistic firing strongly impairs the convergence of learning. When the degree of randomness in the exploration is increased, convergence improves, but the distribution of straight path segments becomes unrealistic and paths become 'wiggly'. To remedy this without affecting the path characteristics, two additional mechanisms are implemented: a gradual drop of the learned weights (weight decay) and a path-length limitation, which prevents learning if the reward is not found within some expected time. Both mechanisms limit the memory of the system and thereby counteract the effects of getting trapped on a wrong path. When these strategies are used individually, the number of divergent cases is substantially reduced, and for some parameter settings no divergence was found at all. When weight decay and path-length limitation are used at the same time, convergence does not improve much further; instead, the time to convergence increases because the memory-limiting effect becomes too strong. The degree of improvement also depends on the size and degree of overlap (coverage density) of the place-field system. The chosen combination of these two parameters leads to a trade-off between convergence and speed of convergence. Thus, this study suggests that the role of the PF system in navigation learning cannot be considered independently of the animals' exploration pattern.
Pages: 562-582
Number of pages: 21
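
As a concrete illustration of the mechanisms summarized in the abstract, the sketch below implements SARSA with a linear function approximation over probabilistically firing Gaussian place fields, together with weight decay and a path-length limitation. It is a minimal sketch under assumed parameters, not the authors' model: the arena geometry, field sizes, learning rates, the epsilon-greedy exploration (the biased straight-segment exploration analyzed in the paper is omitted), and all names such as `place_activity` and `MAX_STEPS` are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation) of SARSA with a linear
# function approximation over probabilistically firing Gaussian place fields,
# plus weight decay and a path-length limitation.  Arena size, parameter
# values and all names are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# --- Place-field "state space": Gaussian fields tiling a 1 m x 1 m arena ---
N_FIELDS = 100
centers = rng.uniform(0.0, 1.0, size=(N_FIELDS, 2))  # field centres (m)
SIGMA = 0.10                                          # field width (m)
P_FIRE = 0.8                                          # probabilistic firing

def place_activity(pos):
    """Place-field activations at position pos, with stochastic drop-out."""
    d2 = np.sum((centers - pos) ** 2, axis=1)
    rate = np.exp(-d2 / (2.0 * SIGMA ** 2))
    return rate * (rng.random(N_FIELDS) < P_FIRE)

ACTIONS = np.array([[0.05, 0.0], [-0.05, 0.0], [0.0, 0.05], [0.0, -0.05]])
W = np.zeros((len(ACTIONS), N_FIELDS))                # linear Q-value weights

ALPHA, GAMMA, EPSILON = 0.05, 0.95, 0.2
DECAY = 1e-4           # gradual drop of the learned weights ("weight decay")
MAX_STEPS = 400        # path-length limitation: expected time to the reward
START, GOAL, GOAL_R = np.array([0.1, 0.1]), np.array([0.9, 0.9]), 0.1

def q_values(phi):
    return W @ phi                                    # one Q-value per action

def choose(phi):
    """Epsilon-greedy choice (the paper's straight-path bias is omitted)."""
    if rng.random() < EPSILON:
        return int(rng.integers(len(ACTIONS)))
    return int(np.argmax(q_values(phi)))

for episode in range(200):
    W_backup = W.copy()        # needed to undo learning on over-long trials
    pos = START.copy()
    phi = place_activity(pos)
    a = choose(phi)
    found = False
    for step in range(MAX_STEPS):
        pos = np.clip(pos + ACTIONS[a], 0.0, 1.0)
        found = bool(np.linalg.norm(pos - GOAL) < GOAL_R)
        r = 1.0 if found else 0.0
        phi_next = place_activity(pos)
        a_next = choose(phi_next)
        target = r + (0.0 if found else GAMMA * q_values(phi_next)[a_next])
        W[a] += ALPHA * (target - q_values(phi)[a]) * phi  # on-line SARSA update
        phi, a = phi_next, a_next
        if found:
            break
    if not found:
        W = W_backup           # path-length limitation: no learning without reward
    W *= 1.0 - DECAY           # weight decay after every trial
```

In this sketch the path-length limitation is realized by reverting the weights to a copy taken at the start of any trial that fails to reach the reward within MAX_STEPS, which is one simple way to prevent learning when the reward is not found within the expected time; the weight decay is a multiplicative shrinkage applied once per trial.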