Node selection using adversarial expert-based multi-armed bandits in distributed computing

Cited by: 0
Authors
Alfahad, Saleh [1]
Parambath, Shameem Puthiya [1]
Anagnostopoulos, Christos [1]
Kolomvatsos, Kostas [2]
Affiliations
[1] Univ Glasgow, Sch Comp Sci, Glasgow City, Scotland
[2] Univ Thessaly, Dept Informat & Telecommun, Lamia, Greece
Keywords
Edge computing; Node selection; Multi-armed bandits; Non-stochastic bandits; EDGE; CLOUD; NETWORKS; SYSTEMS
DOI
10.1007/s00607-025-01443-w
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The edge computing (EC) paradigm enhances the Quality of Service of distributed computing applications by bringing computation closer to data sources, such as sensors, IoT devices, and local servers, instead of relying solely on centralized data centers (e.g., the Cloud). In EC environments, node selection is the problem of determining which distributed computing nodes should perform computing tasks, taking into account heterogeneous factors such as limited resources, network context, and the nodes' computational capabilities. Evidently, node selection affects the efficiency and performance of EC environments. Recent node selection strategies rely on either heuristic or optimization methods, which inherently assume static environments. However, distributed environments consist of highly heterogeneous and dynamic systems, and addressing this dynamic nature requires node selection strategies that leverage real-time feedback. In this paper, we propose sequential learning algorithms based on multi-armed bandits (MABs) for the node selection problem. Unlike previous MAB approaches, we contribute novel MAB algorithms for node selection that use deep learning expert models. To tackle the inherent uncertainty associated with nodes, we introduce ExpGradBand, a novel expert-based gradient MAB algorithm that combines the selection efficiency of gradient bandits with historical contextual information. Furthermore, we evaluate and compare ExpGradBand with various MAB approaches and baselines from the literature, with and without contextual information. Our evaluation includes comprehensive experiments that assess the performance of these methods in settings with delayed or lost contextual feedback.
Pages: 25
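For readers unfamiliar with the gradient bandits mentioned in the abstract, the following is a minimal sketch of a plain textbook gradient bandit in a toy node selection loop. It is not ExpGradBand: the expert models and contextual information described in the abstract are omitted. The class name `GradientBandit`, the function `simulate`, the node qualities, the noise range, and the step size are all illustrative assumptions, not values from the paper.

```python
import math
import random


def softmax(prefs):
    """Turn preference scores into selection probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


class GradientBandit:
    """Textbook gradient bandit with a running-average reward baseline."""

    def __init__(self, n_nodes, step_size=0.1, seed=0):
        self.prefs = [0.0] * n_nodes   # one preference per node (arm)
        self.step_size = step_size     # assumed value, not from the paper
        self.baseline = 0.0            # average of rewards seen so far
        self.t = 0
        self.rng = random.Random(seed)

    def policy(self):
        return softmax(self.prefs)

    def select(self):
        probs = self.policy()
        return self.rng.choices(range(len(probs)), weights=probs)[0]

    def update(self, node, reward):
        probs = self.policy()
        advantage = reward - self.baseline
        for i, p in enumerate(probs):
            indicator = 1.0 if i == node else 0.0
            # Raise the chosen node's preference when the reward beats the
            # baseline, and lower the others proportionally (and vice versa).
            self.prefs[i] += self.step_size * advantage * (indicator - p)
        # Update the running average after computing the advantage.
        self.t += 1
        self.baseline += (reward - self.baseline) / self.t


def simulate(rounds=3000, seed=0):
    """Toy loop: three hypothetical nodes with fixed quality plus small noise."""
    quality = [0.2, 0.5, 0.9]          # hypothetical per-node reward levels
    env_rng = random.Random(seed + 1)
    bandit = GradientBandit(n_nodes=len(quality), seed=seed)
    for _ in range(rounds):
        node = bandit.select()
        reward = quality[node] + env_rng.uniform(-0.05, 0.05)
        bandit.update(node, reward)
    return bandit.policy()
```

Calling `simulate()` returns the final selection probabilities over the three hypothetical nodes. In this stationary toy setting the policy should concentrate on the highest-quality node. The adversarial, delayed-feedback, and lost-feedback settings studied in the paper are not modeled here.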