Evaluating the capabilities of large language models using machine learning tasks at inference-time

Cited by: 0
Author
Grm, Klemen [1]
Affiliation
[1] Univ Ljubljani, Fak Elektrotehniko, Trzaska Cesta 25, Ljubljana 1000, Slovenia
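Source
ELEKTROTEHNISKI VESTNIK | 2023 / Volume 90 / Issue 05
Keywords
language models; machine learning; evaluation methodology
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology]
Subject classification codes
0808; 0809
Abstract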
Machine learning is the domain of algorithms capable of learning from data to improve their performance on a task or set of tasks. Common machine learning tasks include classification, regression, and generative modelling. The most common modern example of a machine learner in practical use is a deep neural network coupled with an extrinsic optimizer such as stochastic gradient descent. Recently, scaled-up large language models have shown increasing capabilities for in-context meta-learning, which has been used to improve their performance on language tasks through few-shot learning. In this paper, we show that pre-trained large language models can act as machine learners with regard to in-context data, without using extrinsic optimization tools or weight updates. By evaluating the language models' inference-time machine learning abilities on synthetic or appropriately transformed datasets, we conclusively show that they are able to model complex relationships between data in the input context. This implies that inference-time machine learning tasks represent a meaningful capability evaluation task for large language models.
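The evaluation idea in the abstract can be illustrated with a minimal sketch. It is not the paper's protocol: the linear task, the "Input:/Output:" prompt format, and every function name below are assumptions made for illustration, and `nearest_neighbour_stub` is a hypothetical stand-in for a real language model call so that the example runs offline. The sketch serializes synthetic training pairs into the context, asks for the output of a held-out input, and scores the completions with mean absolute error, with no weight updates involved.

```python
import random


def make_linear_dataset(n, slope=3.0, intercept=-2.0, seed=0):
    """Synthetic regression data y = slope * x + intercept (assumed task)."""
    rng = random.Random(seed)
    xs = [round(rng.uniform(-10, 10), 2) for _ in range(n)]
    return [(x, round(slope * x + intercept, 2)) for x in xs]


def build_prompt(train_pairs, query_x):
    """Serialize in-context examples and leave the final output blank."""
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in train_pairs]
    blocks.append(f"Input: {query_x}\nOutput:")
    return "\n\n".join(blocks)


def mean_absolute_error(predictions, targets):
    return sum(abs(p - t) for p, t in zip(predictions, targets)) / len(targets)


def evaluate(model_fn, train_pairs, test_pairs):
    """Score a prompt -> completion function on held-out inputs.

    model_fn is any callable mapping a prompt string to a completion
    string; the model only ever sees the data through the prompt.
    """
    predictions = []
    for x, _ in test_pairs:
        completion = model_fn(build_prompt(train_pairs, x))
        # Take the first whitespace-separated token as the numeric answer.
        predictions.append(float(completion.strip().split()[0]))
    return mean_absolute_error(predictions, [y for _, y in test_pairs])


def nearest_neighbour_stub(prompt):
    """Hypothetical stand-in for a language model call.

    Parses the prompt and returns the output of the closest in-context
    input. Replace with a real model call to run an actual evaluation.
    """
    blocks = prompt.split("\n\n")
    query_x = float(blocks[-1].splitlines()[0].split(": ")[1])
    examples = []
    for block in blocks[:-1]:
        in_line, out_line = block.splitlines()
        examples.append(
            (float(in_line.split(": ")[1]), float(out_line.split(": ")[1]))
        )
    _, nearest_y = min(examples, key=lambda pair: abs(pair[0] - query_x))
    return f" {nearest_y}"


if __name__ == "__main__":
    data = make_linear_dataset(30)
    train, test = data[:20], data[20:]
    print("MAE of stub baseline:", evaluate(nearest_neighbour_stub, train, test))
```

Passing a function that wraps an actual model in place of the stub would give an error figure that can be compared against simple baselines such as this nearest-neighbour rule.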
Pages: 247-253
Page count: 7