Automatic Text Classification With Large Language Models: A Review of openai for Zero- and Few-Shot Classification

Cited by: 0

Authors:

Anglin, Kylie L. [1]

Ventura, Claudia [1]

Affiliations:

[1] Univ Connecticut, Storrs, CT 06269 USA

Source:

JOURNAL OF EDUCATIONAL AND BEHAVIORAL STATISTICS | 2024

Keywords:

large language models; LLMs; artificial intelligence; openai; educational measurement

DOI:

10.3102/10769986241279927

Chinese Library Classification (CLC) number:

G40 [Education]

Discipline classification codes:

040101; 120403

Abstract:

While natural language documents, such as intervention transcripts and participant writing samples, can provide highly nuanced insights into educational and psychological constructs, researchers often find these materials difficult and expensive to analyze. Recent developments in machine learning, however, have allowed social scientists to harness the power of artificial intelligence for complex data categorization tasks. One approach, supervised learning, supports high-performance categorization yet still requires a large, hand-labeled training corpus, which can be costly. An alternative approach, zero- and few-shot classification with pretrained large language models, offers a cheaper, compelling alternative. This article considers the application of zero-shot and few-shot classification in educational research. We provide an overview of large language models, a step-by-step tutorial on using the Python openai package for zero-shot and few-shot classification, and a discussion of relevant research considerations for social scientists.

Pages: 23
