Developing prompts from large language model for extracting clinical information from pathology and ultrasound reports in breast cancer

Cited by: 58

Authors:

Choi, Hyeon Seok ^{[1]}

Song, Jun Yeong ^{[1]}

Shin, Kyung Hwan ^{[1,2]}

Chang, Ji Hyun ^{[1]}

Jang, Bum-Sup ^{[1,3]}

Affiliations:

[1] Seoul Natl Univ, Seoul Natl Univ Hosp, Coll Med, Dept Radiat Oncol, Seoul, South Korea

[2] Seoul Natl Univ, Med Res Ctr, Inst Radiat Med, Seoul, South Korea

[3] Seoul Natl Univ Hosp, Dept Radiat Oncol, 101 Daehak Ro, Seoul 03080, South Korea

Source:

RADIATION ONCOLOGY JOURNAL | 2023 / Volume 41 / Issue 03

Keywords:

Automatic data processing; Artificial intelligence; Natural language processing; Breast cancer; Clinical reports; KOREA;

DOI:

10.3857/roj.2023.00633

Chinese Library Classification (CLC) number:

R73 [Oncology];

Discipline classification code:

100214 ;

Abstract:

Purpose: We aimed to evaluate the time and cost of developing prompts with a large language model (LLM) tailored to extract clinical factors in breast cancer patients, and to assess the accuracy of those prompts. Materials and Methods: We collected data from surgical pathology and ultrasound reports of breast cancer patients who underwent radiotherapy from 2020 to 2022. We extracted the information using the Generative Pre-trained Transformer (GPT) for Sheets and Docs extension plugin and termed this the "LLM" method. The time and cost of developing the prompts with the LLM method were assessed and compared with those spent on collecting information with the "full manual" and "LLM-assisted manual" methods. To assess accuracy, 340 patients were randomly selected, and the information extracted by the LLM method was compared with that collected by the "full manual" method. Results: Data from 2,931 patients were collected. We developed 12 prompts for the Extract function and 12 for the Format function to extract and standardize the information. The overall accuracy was 87.7%. For lymphovascular invasion, it was 98.2%. Developing and processing the prompts took 3.5 hours and 15 minutes, respectively. Using the ChatGPT application programming interface (API) cost US $65.8; when the estimated wage was factored in, the total cost was US $95.4. In an estimated comparison, the "LLM-assisted manual" and "LLM" methods were time- and cost-efficient compared with the "full manual" method. Conclusion: Developing and using prompts for an LLM to derive clinical factors was an efficient way to extract crucial information from large volumes of medical records. This study demonstrated the potential of natural language processing with LLMs in breast cancer patients. Prompts from the current study can be reused in other research to collect clinical information.
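The accuracy check described in the abstract (a random sample of patients, with LLM-extracted values compared field by field against manually collected values) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the field names (`lvi`, `grade`), and the toy records are hypothetical, and pooling all field comparisons into one "overall" figure is an assumption about how such a number might be computed, not the authors' documented procedure.

```python
import random


def sample_patients(patient_ids, n, seed=0):
    """Draw n patient IDs at random without replacement (seed is illustrative)."""
    rng = random.Random(seed)
    return rng.sample(list(patient_ids), n)


def field_accuracy(llm_records, manual_records, fields):
    """Compare LLM output with the manual reference, field by field.

    Returns (per-field accuracy dict, accuracy pooled over all fields).
    A patient missing from llm_records counts as a mismatch.
    """
    per_field = {}
    total_matches = 0
    total_comparisons = 0
    for field in fields:
        matches = sum(
            1
            for pid, reference in manual_records.items()
            if llm_records.get(pid, {}).get(field) == reference[field]
        )
        per_field[field] = matches / len(manual_records)
        total_matches += matches
        total_comparisons += len(manual_records)
    return per_field, total_matches / total_comparisons


# Hypothetical toy records; real values would come from the reports.
manual = {
    "p1": {"lvi": "present", "grade": "2"},
    "p2": {"lvi": "absent", "grade": "3"},
    "p3": {"lvi": "absent", "grade": "1"},
    "p4": {"lvi": "present", "grade": "2"},
}
llm = {
    "p1": {"lvi": "present", "grade": "2"},
    "p2": {"lvi": "absent", "grade": "2"},
    "p3": {"lvi": "absent", "grade": "1"},
    "p4": {"lvi": "present", "grade": "3"},
}

per_field, overall = field_accuracy(llm, manual, ["lvi", "grade"])
print(per_field, overall)
```

In this toy data the `lvi` field matches for all four patients and `grade` for two of four, so the pooled figure is 6 of 8 comparisons. A call such as `sample_patients(range(2931), 340)` mirrors the sample size reported in the abstract.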


Pages: 209-216

Number of pages: 8

