Evaluating the capabilities of large language models using machine learning tasks at inference-time

Cited by: 0

Authors:

Grm, Klemen [1]

Affiliations:

[1] Univ Ljubljani, Fak Elektrotehniko, Trzaska Cesta 25, Ljubljana 1000, Slovenia

Source:

ELEKTROTEHNISKI VESTNIK | 2023 / Vol. 90 / Issue 05

Keywords:

language models; machine learning; evaluation methodology

DOI:

Not available

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Subject classification codes:

0808; 0809

Abstract:

Machine learning is the domain of algorithms capable of learning from data to improve their performance on a task or set of tasks. Common machine learning tasks include classification, regression, and generative modelling. The most common modern example of machine learners in practical use is deep neural networks coupled with an extrinsic optimizer such as stochastic gradient descent. Recently, scaled-up large language models have shown increasing capabilities for in-context meta-learning, which has been used to improve their performance on language tasks through few-shot learning. In this paper, we show that pre-trained large language models can act as machine learners with regard to in-context data, without using extrinsic optimization tools or weight updates. By evaluating the language models' inference-time machine learning abilities on synthetic or appropriately transformed datasets, we conclusively show that they are able to model complex relationships between data in the input context. This implies that inference-time machine learning tasks represent a meaningful capability evaluation task for large language models.
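
The abstract does not give the paper's actual protocol, so the following is only a minimal sketch of what an inference-time regression evaluation could look like: a synthetic dataset is serialized into the prompt, the model completes the target for a held-out input, and the completion is scored. The prompt format, the linear data generator, and every function name here are assumptions made for illustration. `complete` is a hypothetical callable (prompt in, text out) standing in for a language model, and `oracle_stub` is a placeholder used only to exercise the pipeline.

```python
import random


def make_synthetic_regression(n, slope=2.0, intercept=1.0, seed=0):
    """Generate n (x, y) pairs from an assumed linear rule y = slope * x + intercept."""
    rng = random.Random(seed)
    xs = [round(rng.uniform(-5, 5), 2) for _ in range(n)]
    return [(x, round(slope * x + intercept, 2)) for x in xs]


def build_prompt(context_pairs, query_x):
    """Serialize in-context examples, then leave the query target blank."""
    lines = [f"x = {x}, y = {y}" for x, y in context_pairs]
    lines.append(f"x = {query_x}, y =")
    return "\n".join(lines)


def parse_prediction(completion):
    """Read the first whitespace-delimited token as a float; None if it is not a number."""
    tokens = completion.strip().split()
    if not tokens:
        return None
    try:
        return float(tokens[0].rstrip(",.;"))
    except ValueError:
        return None


def evaluate(complete, n_context=20, n_queries=5, seed=0):
    """Return (mean absolute error over parsed predictions, number of unparsed completions).

    No weights are updated: the only "training" signal is the context in the prompt.
    """
    data = make_synthetic_regression(n_context + n_queries, seed=seed)
    context, queries = data[:n_context], data[n_context:]
    errors = []
    for x, y in queries:
        prediction = parse_prediction(complete(build_prompt(context, x)))
        if prediction is not None:
            errors.append(abs(prediction - y))
    mae = sum(errors) / len(errors) if errors else None
    return mae, n_queries - len(errors)


def oracle_stub(prompt):
    """Hypothetical stand-in for a model: it knows the generating rule and answers exactly."""
    last_line = prompt.splitlines()[-1]  # e.g. "x = 1.23, y ="
    x = float(last_line.split(",")[0].split("=")[1])
    return f" {round(2.0 * x + 1.0, 2)}"


if __name__ == "__main__":
    print(evaluate(oracle_stub))
```

Replacing `oracle_stub` with a wrapper around a real model would turn the returned error into a capability measure of the kind the abstract argues for. A classification or generative task would need a different serialization and metric.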


Pages: 247-253

Page count: 7
