LLM-Commentator: Novel fine-tuning strategies of large language models for automatic commentary generation using football event data

Cited by: 8
Authors
Cook, Alec [1]
Karakul, Oktay [1]
Affiliations
[1] Cardiff Univ, Sch Comp Sci & Informat, Senghennydd Rd, Cardiff CF24 4AG, Wales
Keywords
Large language models; Natural language processing; Football; Commentary generation; Fine-tuning
DOI
10.1016/j.knosys.2024.112219
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Real-time commentary on football matches is a challenging task that requires precise and coherent descriptions of events as they unfold. Traditional methods often fall short in providing timely and accurate insights into the game. This study explores the use of large language model (LLM) techniques to develop a language model, dubbed LLM-Commentator, that can generate (near) real-time commentary on football matches. The goal is to demonstrate that open-source language models, when fine-tuned with domain-specific data on consumer-grade hardware, can accurately depict football events from raw match data. Three distinct training strategies are employed to fine-tune the language models, addressing various challenges encountered in generating real-time football commentary. The study evaluates the efficacy of these models in producing coherent and accurate descriptions of unseen football events. Among the three strategies proposed, the Mixed Immediately Model emerges as particularly efficient in learning and in handling challenging workloads. This suggests a promising future for simultaneous multi-task learning with compact, open-source language models in the context of real-time sports commentary. Additionally, the study highlights the practicality of using consumer-grade hardware for fine-tuning language models with specialised knowledge. The findings underscore the importance of customising training approaches and ensuring well-balanced datasets when fine-tuning language models for specific tasks. Moreover, they serve as a practical guide for broader accessibility to large language models and contribute to the application of NLP in sports journalism, enabling more insightful and engaging real-time commentary on football matches.
Pages: 22