Team formation through an assessor: choosing MARL agents in pursuit-evasion games

Cited by: 0

Authors:

Zhao, Yue [1]

Ju, Lushan [1]

Hernandez-Orallo, Jose [2]

Affiliations:

[1] Northwestern Polytech Univ, Sch Comp Sci & Engn, Dongxiang Rd, Xian 710129, Peoples R China

[2] Univ Politecn Valencia ValGRAI, Valencian Res Inst Artificial Intelligence VRAIN, Valencia 46022, Spain

Source:

COMPLEX & INTELLIGENT SYSTEMS | 2024 / Volume 10 / Issue 03

Funding:

National Natural Science Foundation of China;

Keywords:

Team formation; Multi-agent reinforcement learning; Pursuit-evasion Games; Multi-agent systems; INTELLIGENCE; STRATEGIES;

DOI:

10.1007/s40747-023-01336-5

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Team formation in multi-agent systems usually assumes that the capabilities of each team member are known, and that the best formation can be derived from that information. As AI agents become more sophisticated, this characterisation is becoming more elusive and less predictive of the performance of a team in cooperative or competitive situations. In this paper, we introduce a general and flexible way of anticipating the outcome of a game for any lineup (the agents, sociality regimes and any other hyperparameters for the team). To this end, we simply train an assessor using an appropriate team representation and standard machine learning techniques. We illustrate how we can interrogate the assessor to find the best formations in a pursuit-evasion game for several scenarios: offline team formation, where teams have to be decided before the game and not changed afterwards, and online team formation, where teams can see the lineups of the other teams and can be changed at any time.
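The assessor idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's setup: the agent names, the `SKILL` table, the toy `simulate` function standing in for real games, the count-based `encode` representation, and the use of plain logistic regression as the "standard machine learning technique". Sociality regimes and other hyperparameters of a lineup are omitted.

```python
import itertools
import math
import random

AGENTS = ["A", "B", "C", "D"]                      # hypothetical agent types
SKILL = {"A": 0.2, "B": 0.5, "C": 0.9, "D": 1.4}   # synthetic; used only by the toy simulator
LINEUPS = list(itertools.combinations_with_replacement(AGENTS, 2))

def encode(pursuers, evaders):
    """Team representation: per-side counts of each agent type, plus a bias term."""
    return ([float(pursuers.count(a)) for a in AGENTS]
            + [float(evaders.count(a)) for a in AGENTS] + [1.0])

def simulate(pursuers, evaders, rng):
    """Toy stand-in for playing one game; returns 1 if the pursuers win."""
    margin = sum(SKILL[a] for a in pursuers) - sum(SKILL[a] for a in evaders)
    return 1 if rng.random() < 1.0 / (1.0 + math.exp(-2.0 * margin)) else 0

def predict(w, pursuers, evaders):
    """Assessor output: predicted probability that the pursuers win."""
    z = sum(wi * xi for wi, xi in zip(w, encode(pursuers, evaders)))
    return 1.0 / (1.0 + math.exp(-z))

def train_assessor(games, epochs=100, lr=0.5):
    """Fit logistic regression by batch gradient ascent on the log-likelihood."""
    w = [0.0] * (2 * len(AGENTS) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for pursuers, evaders, won in games:
            err = won - predict(w, pursuers, evaders)
            for i, xi in enumerate(encode(pursuers, evaders)):
                grad[i] += err * xi
        w = [wi + lr * gi / len(games) for wi, gi in zip(w, grad)]
    return w

# Collect outcome records for every pair of lineups, then train the assessor.
rng = random.Random(0)
games = [(p, e, simulate(p, e, rng))
         for p in LINEUPS for e in LINEUPS for _ in range(10)]
assessor = train_assessor(games)

# Online formation: the opposing lineup is visible, so query against it directly.
evaders_seen = ("C", "D")
best_online = max(LINEUPS, key=lambda p: predict(assessor, p, evaders_seen))

# Offline formation: the lineup is fixed in advance; here we pick the one with
# the highest predicted win rate summed over all possible opposing lineups.
best_offline = max(LINEUPS,
                   key=lambda p: sum(predict(assessor, p, e) for e in LINEUPS))
```

In this sketch, "interrogating the assessor" amounts to enumerating candidate lineups and ranking them by predicted outcome. Averaging over opponents is only one possible offline criterion; a worst-case (maximin) criterion would be an equally plausible choice.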


Pages: 3473-3492

Number of pages: 20

