Decentralized Multi-Agent Bandit Learning for Intelligent Internet of Things Systems

Cited by: 2
Authors
Leng, Qiuyu [1 ,2 ,3 ]
Wang, Shangshang [1 ]
Huang, Xi [4 ]
Shao, Ziyu [1 ]
Yang, Yang [1 ]
Affiliations
[1] ShanghaiTech Univ, Sch Informat Sci & Technol, Shanghai, Peoples R China
[2] Chinese Acad Sci, Shanghai Inst Microsyst & Informat Technol, Shanghai, Peoples R China
[3] Univ Chinese Acad Sci, Beijing, Peoples R China
[4] Shenzhen Inst Artificial Intelligence & Robot Soc, Shenzhen, Peoples R China
Source
2022 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC) | 2022
Keywords
Intelligent Internet of Things systems; data heterogeneity; multi-agent bandit learning; IOT;
DOI
10.1109/WCNC51071.2022.9771884
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
In intelligent Internet of Things systems, data-hungry services are empowered by data collection, which is jointly accomplished by edge servers and data-collecting sensors. In this paper, we aim to achieve efficient data collection, i.e., maximize data rates from sensors to servers while mitigating the impact of data heterogeneity for data collected from sensors. Considering geographically distributed servers and sensors, we study the problem from the perspective of multi-agent multi-armed bandits. The key ideas of our approach are to 1) establish associations between servers and sensors under unknown wireless dynamics (i.e., channel state information) and selection fraction constraints; 2) utilize shared information via pairwise communication between servers to mitigate biased observations for data rates. To this end, we propose a scheme that leverages online learning to reduce uncertainties in wireless dynamics and online control to mitigate the impact of data heterogeneity. Based on an effective integration of bandit learning methods under pairwise communication and Lyapunov optimization techniques, we present a novel Decentralized sErver-Sensor association scheme with Multi-Agent learning under pairwise communication (DESMA). Our theoretical analysis demonstrates that DESMA achieves a tunable trade-off between maximizing data rate and mitigating the impact of data heterogeneity.
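The abstract's combination of bandit learning with Lyapunov-style online control can be illustrated with a minimal sketch. This is not DESMA itself: it assumes a single server (so there is no pairwise communication), Bernoulli data rates in [0, 1], and a standard UCB index combined with one virtual queue per sensor to enforce the selection fraction constraints. The function name `run_ucb_with_virtual_queues` and the parameters `mean_rates`, `min_fractions`, and `V` are hypothetical names chosen for illustration.

```python
import math
import random


def run_ucb_with_virtual_queues(mean_rates, min_fractions, V, horizon, seed=0):
    """Single-server sketch: one sensor per round, UCB index plus virtual queues.

    Assumptions (not from the paper): Bernoulli rates, one selection per round,
    and min_fractions summing to less than 1.
    """
    rng = random.Random(seed)
    n = len(mean_rates)
    counts = [0] * n        # number of times each sensor was selected
    estimates = [0.0] * n   # empirical mean data rate per sensor
    queues = [0.0] * n      # virtual queues tracking unmet selection fractions
    total_rate = 0.0

    for t in range(1, horizon + 1):
        scores = []
        for i in range(n):
            if counts[i] == 0:
                ucb = 1.0  # optimistic value for sensors not yet tried
            else:
                bonus = math.sqrt(2.0 * math.log(t) / counts[i])
                ucb = min(1.0, estimates[i] + bonus)
            # Drift-plus-penalty style score: queue backlog plus V-weighted UCB.
            scores.append(queues[i] + V * ucb)
        chosen = max(range(n), key=lambda i: scores[i])

        # Bernoulli feedback stands in for the unknown wireless dynamics.
        reward = 1.0 if rng.random() < mean_rates[chosen] else 0.0
        counts[chosen] += 1
        estimates[chosen] += (reward - estimates[chosen]) / counts[chosen]
        total_rate += reward

        # Each queue grows by its required fraction and shrinks when served.
        for i in range(n):
            served = 1.0 if i == chosen else 0.0
            queues[i] = max(0.0, queues[i] + min_fractions[i] - served)

    fractions = [c / horizon for c in counts]
    return total_rate / horizon, fractions


if __name__ == "__main__":
    rate, fractions = run_ucb_with_virtual_queues(
        mean_rates=[0.9, 0.5, 0.2],
        min_fractions=[0.1, 0.2, 0.3],
        V=5.0,
        horizon=5000,
    )
    print("average rate:", rate)
    print("selection fractions:", fractions)
```

In this sketch, a larger `V` weights the estimated data rate more heavily and lets the virtual queues grow longer before a low-rate sensor is served, while a smaller `V` tracks the fraction constraints more tightly. That is the general shape of the tunable trade-off the abstract describes, under the simplifying assumptions stated above.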
Pages: 2118-2123
Number of pages: 6
Related papers
16 in total
[1]  
[Anonymous], 2018, Construction and Analysis of a Large Scale Image Ontology
[2]  
Audibert J.-Y., 2009, PROC ANN C LEARNING
[3]   Finite-time analysis of the multiarmed bandit problem [J].
Auer, P ;
Cesa-Bianchi, N ;
Fischer, P .
MACHINE LEARNING, 2002, 47 (2-3) :235-256
[4]  
Chawla R., 2020, P AISTATS
[5]  
Grammenos Andreas, 2018, Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, V2, DOI 10.1145/3191743
[6]  
Hu F, 2013, INTELLIGENT SENSOR NETWORKS: THE INTEGRATION OF SENSOR NETWORKS, SIGNAL PROCESSING AND MACHINE LEARNING, P1
[7]   Adaptive data rate control in low power wide area networks for long range IoT services [J].
Kim, Dae-Young ;
Kim, Seokhoon ;
Hassan, Houcine ;
Park, Jong Hyuk .
JOURNAL OF COMPUTATIONAL SCIENCE, 2017, 22 :171-178
[8]  
Leng Q., 2021, DECENTRALIZED MULTI
[9]   Combinatorial Sleeping Bandits With Fairness Constraints [J].
Li, Fengjiao ;
Liu, Jia ;
Ji, Bo .
IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING, 2020, 7 (03) :1799-1813
[10]   Learning IoT in Edge: Deep Learning for the Internet of Things with Edge Computing [J].
Li, He ;
Ota, Kaoru ;
Dong, Mianxiong .
IEEE NETWORK, 2018, 32 (01) :96-101