- [41] Auer P., Cesa-Bianchi N., Freund Y., Schapire R.E., The nonstochastic multiarmed bandit problem, SIAM J Comput, 32, 1, pp. 48-77, (2002)
- [42] Kawazoe Aguilera M., Chen W., Toueg S., Heartbeat: a timeout-free failure detector for quiescent reliable communication, Distributed algorithms: 11th international workshop, WDAG’97, Saarbrücken, Germany, September 24–26, 1997, proceedings. Springer, pp. 126-140, (1997)
- [43] van Berlo B., Saeed A., Ozcelebi T., Towards federated unsupervised representation learning, Proceedings of the third ACM international workshop on edge systems, analytics and networking, pp. 31-36, (2020)
- [44] Xu J., Palanisamy B., Wang Q., Resilient stream processing in edge computing, 2021 IEEE/ACM 21st international symposium on Cluster, Cloud and Internet Computing (CCGrid). IEEE, pp. 504-513, (2021)
- [45] Puthiya Parambath S.A., Anagnostopoulos C., Murray-Smith R., Sequential query prediction based on multi-armed bandits with ensemble of transformer experts and immediate feedback, Data Min Knowl Disc, 38, 6, pp. 3758-3782, (2024)
- [46] Puthiya Parambath S.A., Al-Fahad S.A.M., Anagnostopoulos C., Kolomvatsos K., Sequential block elimination for dynamic pricing, The 2nd international workshop on data mining in finance (DMF 2024) at the IEEE international conference on data mining, Abu Dhabi, United Arab Emirates, 09–12, (2024)
- [47] Lewi Y., Kaplan H., Mansour Y., Thompson sampling for adversarial bit prediction, Algorithmic learning theory. PMLR, pp. 518-553, (2020)
- [48] Sutton R.S., Barto A.G., Reinforcement learning: an introduction, (1998)
- [49] Mei J., Zhong Z., Dai B., Agarwal A., Szepesvari C., Schuurmans D., Stochastic gradient succeeds for bandits, International conference on machine learning. PMLR, pp. 24325-24360, (2023)
- [50] Heliou A., Mertikopoulos P., Zhou Z., Gradient-free online learning in continuous games with delayed rewards, International conference on machine learning. PMLR, pp. 4172-4181, (2020)