Multi-Agent Temporal-Difference Learning with Linear Function Approximation: Weak Convergence under Time-Varying Network Topologies

Cited by: 0

Authors:

Stankovic, Milos S. [1]

Stankovic, Srdjan S. [2, 3]

Affiliations:

[1] Univ Belgrade, Innovat Ctr, Sch Elect Engn, Belgrade, Serbia

[2] Univ Belgrade, Sch Elect Engn, Belgrade, Serbia

[3] Vlatacom Inst, Belgrade, Serbia

Source:

2016 AMERICAN CONTROL CONFERENCE (ACC) | 2016

Keywords:

STOCHASTIC-APPROXIMATION; CONSENSUS; OPTIMIZATION

DOI:

Not available

Chinese Library Classification number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

In this paper, we propose two novel distributed algorithms for iterative multi-agent off-policy linear value function approximation in Markov decision processes. The algorithms do not require any fusion center and are based on incorporating consensus-based collaborations between the agents over time-varying communication networks into recently proposed single-agent algorithms. The resulting distributed algorithms allow the agents to have different behavior policies while evaluating the response to a single target policy, using the same linear parametrization of the value function. Under appropriate assumptions on the time-varying network topology and the overall state-visiting distributions of the agents, we prove, for both algorithms, weak convergence of the parameter estimates to a consensus point determined by an associated ODE. By a proper design of the network parameters and/or topology, this point can be tuned to coincide with the globally optimal point. The properties and the effectiveness of the proposed algorithms are illustrated on an example.
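
The general pattern the abstract describes (a local off-policy update at each agent, followed by consensus averaging with neighbors) can be illustrated with a minimal sketch. This is not the authors' algorithm: the record does not name the single-agent algorithms the paper builds on, so plain importance-weighted TD(0) is used here as a stand-in, and it carries none of the paper's convergence guarantees. All function names, the sample layout `(phi, reward, phi_next, rho)`, the step size `alpha`, the discount `gamma`, and the row-stochastic matrix `weights` are assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def local_td_step(theta, phi, reward, phi_next, rho, alpha, gamma):
    """Importance-weighted TD(0) update for one agent (illustrative stand-in).

    rho is the assumed ratio of target-policy to behavior-policy
    probabilities for the action the agent took.
    """
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi)
    return [t + alpha * rho * delta * p for t, p in zip(theta, phi)]


def consensus_step(thetas, weights):
    """Replace each agent's parameters by a weighted average over agents.

    weights[i][j] is the weight agent i gives to agent j; each row is
    assumed to sum to 1, and the matrix may change from round to round
    to model a time-varying communication graph.
    """
    n, dim = len(thetas), len(thetas[0])
    return [
        [sum(weights[i][j] * thetas[j][k] for j in range(n)) for k in range(dim)]
        for i in range(n)
    ]


def td_consensus_round(thetas, samples, weights, alpha=0.1, gamma=0.9):
    """One round: every agent updates locally, then all agents average."""
    updated = [
        local_td_step(theta, *sample, alpha, gamma)
        for theta, sample in zip(thetas, samples)
    ]
    return consensus_step(updated, weights)


# Two agents with different (hypothetical) behavior policies, hence different rho.
thetas = [[0.0, 0.0], [0.0, 0.0]]
samples = [
    ([1.0, 0.0], 1.0, [0.0, 1.0], 1.0),  # (phi, reward, phi_next, rho)
    ([0.0, 1.0], 0.0, [1.0, 0.0], 2.0),
]
weights = [[0.5, 0.5], [0.5, 0.5]]  # fully connected graph, equal weights
new_thetas = td_consensus_round(thetas, samples, weights)
```

With equal weights on a fully connected graph, both agents hold the same parameter vector after a single round. With sparser or time-varying `weights`, agreement would only be approached over many rounds, which is the regime the abstract's assumptions on the network topology address.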


Pages: 167-172

Page count: 6
