Large Language Models in the Workplace: A Case Study on Prompt Engineering for Job Type Classification

Cited by: 28
Authors
Clavie, Benjamin [1]
Ciceu, Alexandru [2]
Naylor, Frederick [1]
Soulie, Guillaume [1]
Brightwell, Thomas [1]
Affiliations
[1] Bright Network, Edinburgh, Midlothian, Scotland
[2] Silicon Grove, Edinburgh, Midlothian, Scotland
Source
NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, NLDB 2023 | 2023 / Vol. 13913
Keywords
Large Language Models; Text Classification; Natural Language Processing; Industrial Applications; Prompt Engineering;
DOI
10.1007/978-3-031-35320-8_1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This case study investigates the task of job classification in a real-world setting, where the goal is to determine whether an English-language job posting is appropriate for a graduate or entry-level position. We explore multiple approaches to text classification, including supervised approaches such as traditional models like Support Vector Machines (SVMs) and state-of-the-art deep learning methods such as DeBERTa. We compare them with Large Language Models (LLMs) used in both few-shot and zero-shot classification settings. To accomplish this task, we employ prompt engineering, a technique that involves designing prompts to guide the LLMs towards the desired output. Specifically, we evaluate the performance of two commercially available state-of-the-art GPT-3.5-based language models, text-davinci-003 and gpt-3.5-turbo. We also conduct a detailed analysis of the impact of different aspects of prompt engineering on the model's performance. Our results show that, with a well-designed prompt, a zero-shot gpt-3.5-turbo classifier outperforms all other models, achieving a 6% increase in Precision@95% Recall compared to the best supervised approach. Furthermore, we observe that the wording of the prompt is a critical factor in eliciting the appropriate "reasoning" in the model, and that seemingly minor aspects of the prompt significantly affect the model's performance.
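The abstract reports results as Precision@95% Recall but this record does not define the metric. A minimal sketch of one common reading follows: sweep the decision threshold over the classifier scores and report the best precision among thresholds that still reach at least 95% recall. The function name `precision_at_recall`, its parameters, and the toy data are illustrative assumptions, not the authors' evaluation code.

```python
def precision_at_recall(labels, scores, min_recall=0.95):
    """Best precision over score thresholds whose recall is >= min_recall.

    labels: iterable of 0/1 ground-truth values (1 = positive class).
    scores: classifier scores, higher meaning "more likely positive".
    NOTE: hypothetical helper; the paper's exact definition may differ.
    """
    labels = list(labels)
    scores = list(scores)
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)

    best = 0.0
    true_pos = 0
    for i, (score, label) in enumerate(ranked, start=1):
        true_pos += label
        # Tied scores cannot be separated by a threshold, so only
        # evaluate once the whole group of ties has been counted.
        if i < len(ranked) and ranked[i][0] == score:
            continue
        recall = true_pos / total_pos
        if recall >= min_recall:
            # i examples are predicted positive at this threshold.
            best = max(best, true_pos / i)
    return best


# Toy data (made up for illustration only).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
print(precision_at_recall(toy_labels, toy_scores))
```

On the toy data, reaching 95% recall requires accepting the top four examples, three of which are positive, so the sketch prints 0.75.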
Pages: 3-17
Page count: 15