Multi-Agent Temporal-Difference Learning with Linear Function Approximation: Weak Convergence under Time-Varying Network Topologies

Cited by: 0
Authors
Stankovic, Milos S. [1]
Stankovic, Srdjan S. [2, 3]
Affiliations
[1] Univ Belgrade, Innovat Ctr, Sch Elect Engn, Belgrade, Serbia
[2] Univ Belgrade, Sch Elect Engn, Belgrade, Serbia
[3] Vlatacom Inst, Belgrade, Serbia
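Source
2016 AMERICAN CONTROL CONFERENCE (ACC) | 2016
Keywords
STOCHASTIC-APPROXIMATION; CONSENSUS; OPTIMIZATION
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract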
In this paper we propose two novel distributed algorithms for iterative multi-agent off-policy linear value function approximation in Markov decision processes. The algorithms do not require any fusion center and are based on incorporating consensus-based collaborations between the agents over time-varying communication networks into recently proposed single-agent algorithms. The resulting distributed algorithms allow the agents to have different behavior policies while evaluating the response to a single target policy, using the same linear parametrization of the value function. Under appropriate assumptions on the time-varying network topology and the overall state-visiting distributions of the agents we prove for both algorithms weak convergence of the parameter estimates to a consensus point determined by an associated ODE. By a proper design of the network parameters and/or topology, this point can be tuned to coincide with the globally optimal point. The properties and the effectiveness of the proposed algorithms are illustrated on an example.
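The interplay the abstract describes, with each agent updating a shared linear parametrization locally and then averaging with its neighbours over a changing network, can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not name the single-agent algorithms it builds on, so plain importance-weighted TD(0) is swapped in here, and the update-then-average ordering is an assumption. All function names, the ring-shaped topology, and the parameters (`gamma`, `step_size`, `rho`, `shift`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def td0_direction(theta, phi, phi_next, reward, gamma, rho):
    """Importance-weighted TD(0) direction for the linear estimate phi @ theta.

    rho is the ratio target-policy probability / behavior-policy probability
    of the action taken (assumed to be supplied by the caller).
    """
    delta = reward + gamma * (phi_next @ theta) - (phi @ theta)  # TD error
    return rho * delta * phi


def consensus_td_step(thetas, samples, weights, gamma, step_size):
    """One synchronous round: local TD update per agent, then weighted averaging.

    thetas:  (n_agents, d) array, one parameter vector per row
    samples: list of (phi, phi_next, reward, rho) tuples, one per agent
    weights: (n_agents, n_agents) row-stochastic matrix for the current topology
    """
    local = np.array([
        theta + step_size * td0_direction(theta, phi, phi_next, r, gamma, rho)
        for theta, (phi, phi_next, r, rho) in zip(thetas, samples)
    ])
    # Each agent replaces its estimate by a convex combination of the
    # estimates of the agents it currently listens to.
    return weights @ local


def ring_weights(n_agents, shift):
    """Hypothetical time-varying topology: agent i averages with agent i + shift."""
    w = 0.5 * np.eye(n_agents)
    for i in range(n_agents):
        w[i, (i + shift) % n_agents] += 0.5
    return w
```

Calling `consensus_td_step` repeatedly with `ring_weights(n_agents, shift)` for a `shift` that changes from round to round mimics a time-varying communication graph. Each agent may supply its own `rho`, which reflects the abstract's setting of different behavior policies evaluating a single target policy.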
Pages: 167-172
Page count: 6