Evaluating the capabilities of large language models using machine learning tasks at inference-time

Author
Grm, Klemen [1]
Affiliation
[1] Univ Ljubljani, Fak Elektrotehniko, Trzaska Cesta 25, Ljubljana 1000, Slovenia
Source
ELEKTROTEHNISKI VESTNIK | 2023 / Vol. 90 / Issue 05
Keywords
language models; machine learning; evaluation methodology;
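DOI
Not available
Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code
0808 ; 0809 ;
Abstract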
Machine learning is the domain of algorithms capable of learning from data to improve their performance on a task or set of tasks. Common machine learning tasks include classification, regression, and generative modelling. The most common modern example of machine learners in practical use is deep neural networks coupled with an extrinsic optimizer such as stochastic gradient descent. Recently, scaled-up large language models have shown increasing capabilities of in-context meta-learning, which has been used to improve their performance on language tasks through few-shot learning. In this paper, we show that pre-trained large language models can act as machine learners with regard to in-context data, without using extrinsic optimization tools or weight updates. By evaluating the language models' inference-time machine learning abilities on synthetic or appropriately transformed datasets, we conclusively show that they are able to model complex relationships between data in the input context. This implies that inference-time machine learning tasks represent a meaningful capability evaluation task for large language models.
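The abstract does not give the paper's prompt format, datasets, or models, so the following is only a minimal sketch of what an inference-time regression evaluation could look like. All names (`make_linear_dataset`, `build_prompt`, `parse_prediction`, `evaluate`, `nearest_neighbour_stub`) and the `x = ..., y = ...` prompt layout are assumptions made for illustration. The "model" is any callable that maps a prompt string to a completion string; a simple stub stands in for a real language model so the example runs without external services.

```python
import random
from typing import Callable, List, Tuple

Example = Tuple[float, float]


def make_linear_dataset(n: int, slope: float, intercept: float, seed: int = 0) -> List[Example]:
    """Generate a synthetic dataset following y = slope * x + intercept (hypothetical task)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = round(rng.uniform(-10, 10), 2)
        data.append((x, round(slope * x + intercept, 2)))
    return data


def build_prompt(train: List[Example], query_x: float) -> str:
    """Place the training examples in the context, then leave the query's y blank."""
    lines = [f"x = {x}, y = {y}" for x, y in train]
    lines.append(f"x = {query_x}, y =")
    return "\n".join(lines)


def parse_prediction(completion: str) -> float:
    """Read the first whitespace-separated token of the completion as a number."""
    return float(completion.strip().split()[0])


def evaluate(model: Callable[[str], str], train: List[Example], test: List[Example]) -> float:
    """Mean absolute error of the model's in-context predictions; no weights are updated."""
    errors = []
    for x, y in test:
        prediction = parse_prediction(model(build_prompt(train, x)))
        errors.append(abs(prediction - y))
    return sum(errors) / len(errors)


def nearest_neighbour_stub(prompt: str) -> str:
    """Stand-in for a language model (assumption): answers with the y of the closest context x."""
    *context, query = prompt.splitlines()
    query_x = float(query.split(",")[0].split("=")[1])
    pairs = []
    for line in context:
        x_part, y_part = line.split(",")
        pairs.append((float(x_part.split("=")[1]), float(y_part.split("=")[1])))
    _, nearest_y = min(pairs, key=lambda pair: abs(pair[0] - query_x))
    return str(nearest_y)


if __name__ == "__main__":
    data = make_linear_dataset(40, slope=2.0, intercept=1.0)
    train, test = data[:30], data[30:]
    print("MAE of stub model:", evaluate(nearest_neighbour_stub, train, test))
```

To evaluate an actual language model, the stub would be replaced by a function that sends the prompt to the model and returns its text completion; comparing the resulting error against simple baselines such as the nearest-neighbour stub gives a rough indication of whether the model is using the in-context data.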
Pages: 247-253
Page count: 7