Leveraging Large Language Models for Efficient Failure Analysis in Game Development

Cited by: 0

Authors:

Marini, Leonardo [1]

Gisslen, Linus [2]

Sestini, Alessandro [2]

Affiliations:

[1] Frostbite, Stockholm, Sweden

[2] SEED, Electronic Arts (EA), Redwood City, CA, USA

Source:

2024 IEEE Conference on Games (CoG 2024) | 2024

Keywords:

Natural language processing; Validation; Tracing; Games; Software Quality; Software development;

DOI:

10.1109/CoG60054.2024.10645540

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In games, and more generally in the field of software development, early detection of bugs is vital to maintain a high quality of the final product. Automated tests are a powerful tool that can catch a problem earlier in development by executing periodically. As an example, when new code is submitted to the code base, a new automated test verifies these changes. However, identifying the specific change responsible for a test failure becomes harder when dealing with batches of changes, especially in the case of a large-scale project such as a AAA game, where thousands of people contribute to a single code base. This paper proposes a new approach to automatically identify which change in the code caused a test to fail. The method leverages Large Language Models (LLMs) to associate error messages with the corresponding code changes causing the failure. We investigate the effectiveness of our approach with quantitative and qualitative evaluations. Our approach reaches an accuracy of 71% on our newly created dataset, which comprises issues reported by developers at EA over a period of one year. We further evaluated our model through a user study to assess the utility and usability of the tool from a developer perspective, resulting in a significant reduction, up to 60%, in the time spent investigating issues.

Pages: 8