Automatic Text Classification With Large Language Models: A Review of <monospace>openai</monospace> for Zero- and Few-Shot Classification

被引:0
|
作者
Anglin, Kylie L. [1 ]
Ventura, Claudia [1 ]
机构
[1] Univ Connecticut, Storrs, CT 06269 USA
关键词
large language models; LLMs; artificial intelligence; <monospace>openai</monospace>; educational measurement;
D O I
10.3102/10769986241279927
中图分类号
G40 [教育学];
学科分类号
040101 ; 120403 ;
摘要
While natural language documents, such as intervention transcripts and participant writing samples, can provide highly nuanced insights into educational and psychological constructs, researchers often find these materials difficult and expensive to analyze. Recent developments in machine learning, however, have allowed social scientists to harness the power of artificial intelligence for complex data categorization tasks. One approach, supervised learning, supports high-performance categorization yet still requires a large, hand-labeled training corpus, which can be costly. An alternative approach-zero- and few-shot classification with pretrained large language models-offers a cheaper, compelling alternative. This article considers the application of zero-shot and few-shot classification in educational research. We provide an overview of large language models, a step-by-step tutorial on using the Python openai package for zero-shot and few-shot classification, and a discussion of relevant research considerations for social scientists.<br />
引用
收藏
页数:23
相关论文
共 30 条
  • [21] Context Unlocks Emotions: Text-based Emotion Classification Dataset Auditing with Large Language Models
    Yang, Daniel
    Kommineni, Aditya
    Alshehri, Mohammad
    Mohanty, Nilamadhab
    Modi, Vedant
    Gratch, Jonathan
    Narayanan, Shrikanth
    2023 11TH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION, ACII, 2023,
  • [22] Near-real-time Seismic Human Fatality Information Retrieval from Social Media with Few-shot Large-Language Models
    Hou, James
    Xu, Susu
    PROCEEDINGS OF THE TWENTIETH ACM CONFERENCE ON EMBEDDED NETWORKED SENSOR SYSTEMS, SENSYS 2022, 2022, : 1141 - 1147
  • [23] Exploiting Large Language Models for Enhanced Review Classification Explanations Through Interpretable and Multidimensional Analysis
    Cosentino, Cristian
    Gunduz-Cure, Merve
    Marozzo, Fabrizio
    Ozturk-Birim, Sule
    DISCOVERY SCIENCE, DS 2024, PT I, 2025, 15243 : 3 - 18
  • [24] Evaluating large language models for health-related text classification tasks with public social media data
    Guo, Yuting
    Ovadje, Anthony
    Al-Garadi, Mohammed Ali
    Sarker, Abeed
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2024, 31 (10) : 2181 - 2189
  • [25] Comparing human text classification performance and explainability with large language and machine learning models using eye-tracking
    Venkatesh, Jeevithashree Divya
    Jaiswal, Aparajita
    Nanda, Gaurav
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [26] Research on fine-tuning strategies for text classification in the aquaculture domain by combining deep learning and large language models
    Zhenglin Li
    Sijia Zhang
    Peirong Cao
    Jiaqi Zhang
    Zongshi An
    Aquaculture International, 2025, 33 (4)
  • [27] Using KullBack-Liebler Divergence Based Meta-learning Algorithm for Few-Shot Skin Cancer Image Classification: Literature Review and a Conceptual Framework
    Akinrinade, Olusoji B.
    Du, Chunglin
    Ajila, Samuel
    ADVANCES IN COMPUTING AND DATA SCIENCES (ICACDS 2022), PT II, 2022, 1614 : 100 - 111
  • [28] EduDCM: A Novel Framework for Automatic Educational Dialogue Classification Dataset Construction via Distant Supervision and Large Language Models
    Qi, Changyong
    Zheng, Longwei
    Wei, Yuang
    Xu, Haoxin
    Chen, Peiji
    Gu, Xiaoqing
    APPLIED SCIENCES-BASEL, 2025, 15 (01):
  • [29] A comparative study of large language model-based zero-shot inference and task-specific supervised classification of breast cancer pathology reports
    Sushil, Madhumita
    Zack, Travis
    Mandair, Divneet
    Zheng, Zhiwei
    Wali, Ahmed
    Yu, Yan-Ning
    Quan, Yuwei
    Lituiev, Dmytro
    Butte, Atul J.
    JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2024, 31 (10) : 2315 - 2327
  • [30] The added value of including thyroid nodule features into large language models for automatic ACR TI-RADS classification based on ultrasound reports
    Lopez-ubeda, Pilar
    Martin-Noguerol, Teodoro
    Ruiz-Vinuesa, Alba
    Luna, Antonio
    JAPANESE JOURNAL OF RADIOLOGY, 2025, 43 (04) : 593 - 602