Multi-Agent Temporal-Difference Learning with Linear Function Approximation: Weak Convergence under Time-Varying Network Topologies

Cited by: 0

Authors:

Stankovic, Milos S. [1]

Stankovic, Srdjan S. [2,3]

Affiliations:

[1] Univ Belgrade, Innovat Ctr, Sch Elect Engn, Belgrade, Serbia

[2] Univ Belgrade, Sch Elect Engn, Belgrade, Serbia

[3] Vlatacom Inst, Belgrade, Serbia

Source:

2016 AMERICAN CONTROL CONFERENCE (ACC) | 2016

Keywords:

STOCHASTIC-APPROXIMATION; CONSENSUS; OPTIMIZATION

DOI:

Not available

Chinese Library Classification (CLC):

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

In this paper, we propose two novel distributed algorithms for iterative multi-agent off-policy linear value function approximation in Markov decision processes. The algorithms require no fusion center; they incorporate consensus-based collaboration between the agents over time-varying communication networks into recently proposed single-agent algorithms. The resulting distributed algorithms allow the agents to have different behavior policies while evaluating the response to a single target policy, using the same linear parametrization of the value function. Under appropriate assumptions on the time-varying network topology and the overall state-visiting distributions of the agents, we prove, for both algorithms, weak convergence of the parameter estimates to a consensus point determined by an associated ODE. By properly designing the network parameters and/or topology, this point can be tuned to coincide with the globally optimal point. The properties and effectiveness of the proposed algorithms are illustrated with an example.
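The general pattern the abstract describes (a local off-policy update at each agent, followed by a consensus mixing step over a changing network) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithms: it uses a plain importance-weighted linear TD(0) step as a stand-in for the single-agent algorithms the paper builds on (plain off-policy TD(0) is not guaranteed to converge in general), and the alternating ring topology, the function names, and all parameter values are hypothetical.

```python
def td_step(theta, phi, phi_next, reward, rho, gamma, alpha):
    """One importance-weighted linear TD(0) update (illustrative stand-in).

    theta: parameter vector; phi, phi_next: feature vectors of the current
    and next state; rho: importance ratio target/behavior (assumed given).
    """
    v = sum(t * f for t, f in zip(theta, phi))
    v_next = sum(t * f for t, f in zip(theta, phi_next))
    delta = reward + gamma * v_next - v  # temporal-difference error
    return [t + alpha * rho * delta * f for t, f in zip(theta, phi)]


def consensus_step(thetas, weights):
    """Mix the agents' parameter vectors with a row-stochastic matrix."""
    n, d = len(thetas), len(thetas[0])
    return [
        [sum(weights[i][j] * thetas[j][k] for j in range(n)) for k in range(d)]
        for i in range(n)
    ]


def ring_weights(n, t):
    """Hypothetical time-varying topology on a ring of n agents.

    On even steps agent i averages with agent i+1, on odd steps with i-1.
    Each matrix is doubly stochastic, so the network average is preserved.
    """
    shift = 1 if t % 2 == 0 else -1
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        w[i][i] += 0.5
        w[i][(i + shift) % n] += 0.5
    return w


def max_disagreement(thetas):
    """Largest per-coordinate spread between any two agents."""
    d = len(thetas[0])
    return max(
        max(th[k] for th in thetas) - min(th[k] for th in thetas)
        for k in range(d)
    )
```

In a full loop, each agent would call `td_step` on its own sampled transition and then all agents would call `consensus_step` with the weights for the current time step. With the consensus step alone, agents starting from different parameters on this ring agree on the network average after a few dozen iterations; which point the combined scheme settles on depends on the weights and on the agents' state-visiting distributions, as the abstract notes.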


Pages: 167-172

Page count: 6


