Evaluating the capabilities of large language models using machine learning tasks at inference-time

Citations: 0

Authors:

Grm, Klemen [1]

Affiliations:

[1] Univ Ljubljani, Fak Elektrotehniko, Trzaska Cesta 25, Ljubljana 1000, Slovenia

Source:

ELEKTROTEHNISKI VESTNIK | 2023 / Vol. 90 / No. 05

Keywords:

language models; machine learning; evaluation methodology

DOI:

Not available

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic and communication technology]

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Machine learning is the domain of algorithms capable of learning from data to improve their performance on a task or set of tasks. Common machine learning tasks include classification, regression, and generative modelling. The most common modern machine learners in practical use are deep neural networks coupled with an extrinsic optimizer such as stochastic gradient descent. Recently, scaled-up large language models have shown increasing in-context meta-learning capabilities, which have been used to improve their performance on language tasks through few-shot learning. In this paper, we show that pre-trained large language models can act as machine learners with regard to in-context data, without using extrinsic optimization tools or weight updates. By evaluating the language models' inference-time machine learning abilities on synthetic or appropriately transformed datasets, we conclusively show that they are able to model complex relationships between data in the input context. This implies that inference-time machine learning tasks represent a meaningful capability evaluation task for large language models.
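The kind of evaluation the abstract describes can be illustrated with a minimal sketch. It is not the paper's protocol: the synthetic task, the prompt format, and every function name below are assumptions made for illustration. Labelled examples are serialized into the input context, an unlabelled query is appended, and the completion is scored against the true label with no weight updates. The `nearest_neighbour_stub` function is a hypothetical stand-in for a language model call, so the sketch runs without any model.

```python
import random
from typing import Callable, List, Tuple

Point = Tuple[float, float]
Example = Tuple[Point, int]


def make_dataset(n: int, seed: int = 0) -> List[Example]:
    """Synthetic binary task (an assumption): label is 1 when x1 + x2 > 0."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = (round(rng.uniform(-1, 1), 2), round(rng.uniform(-1, 1), 2))
        data.append((x, int(x[0] + x[1] > 0)))
    return data


def build_prompt(train: List[Example], query: Point) -> str:
    """Serialize labelled examples plus one unlabelled query as plain text."""
    lines = [f"Input: {x[0]}, {x[1]} -> Label: {y}" for x, y in train]
    lines.append(f"Input: {query[0]}, {query[1]} -> Label:")
    return "\n".join(lines)


def accuracy(complete: Callable[[str], str],
             train: List[Example], test: List[Example]) -> float:
    """Score a text-completion function on held-out queries."""
    correct = 0
    for x, y in test:
        answer = complete(build_prompt(train, x)).strip()
        correct += int(answer == str(y))
    return correct / len(test)


def _parse_inputs(line: str) -> Point:
    """Recover the two input values from one prompt line."""
    left = line.split("->")[0].replace("Input:", "")
    a, b = (float(v) for v in left.split(","))
    return (a, b)


def nearest_neighbour_stub(prompt: str) -> str:
    """Hypothetical stand-in for a language model: answers with the
    label of the in-context example closest to the query."""
    *shots, query = prompt.split("\n")
    q = _parse_inputs(query)
    best = min(
        shots,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(_parse_inputs(s), q)),
    )
    return best.split("Label:")[1].strip()


if __name__ == "__main__":
    data = make_dataset(40)
    train, test = data[:30], data[30:]
    print(f"accuracy: {accuracy(nearest_neighbour_stub, train, test):.2f}")
```

To evaluate an actual model, the stub would be replaced by a function that sends the prompt to the model and returns its completion; the scoring loop stays the same.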


Pages: 247-253

Page count: 7

