Enhancing Clinical Trial Summarization: Leveraging Large Language Models and Knowledge Graphs for Entity Preservation

Cited by: 0

Authors:

Nahed, Pouyan [1]

Kambar, Mina Esmail Zadeh Nojoo [1]

Taghva, Kazem [1]

Affiliations:

[1] Univ Nevada, Dept Comp Sci, Las Vegas, NV 89154 USA

Source:

PROCEEDINGS OF NINTH INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY, ICICT 2024, VOL 7 | 2024, Vol. 1003

Funding:

National Science Foundation (USA);

Keywords:

Large language models; Clinical data; Summarization; Named entity preservation; Knowledge graph;

DOI:

10.1007/978-981-97-3302-6_26

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Subject classification code:

0812 ;

Abstract:

ClinicalTrials.gov is an accessible online medical resource for researchers, healthcare professionals, and policymakers seeking detailed information on clinical trials. Summarizing these long clinical records can significantly reduce the time database users need, because summarization condenses comprehensive information into concise synopses that preserve the essential meaning and aid understanding. In this paper, we employ the Bidirectional and Auto-Regressive Transformers (BART) model to generate brief summaries of the trials. Our contributions include new preprocessing techniques for model training, which lead to a robust summarization model. Compared to previous studies, the fine-tuned model improved ROUGE-1, ROUGE-2, and ROUGE-L F1-scores by 14%, 23%, and 20%, respectively. Additionally, we present an innovative knowledge graph based on entity classes to assess the generated summaries. This graph not only quantifies the essential entities carried over from the original text into the summaries but also provides insights into their specific order and arrangement in sentences.
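For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how a ROUGE-N F1 score is computed from n-gram overlap, together with a very simple entity-retention ratio. It is an illustration only and not the authors' implementation: the function names, the whitespace tokenization, and the `entity_preservation` measure are assumptions made here, and the paper's knowledge graph (which also captures entity order and arrangement) is not reproduced.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """ROUGE-N F1 from clipped n-gram overlap.

    Assumption: lowercased whitespace tokenization, no stemming.
    """
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def entity_preservation(source_entities, summary):
    """Fraction of source entities found verbatim in the summary.

    Hypothetical simplification: case-insensitive substring matching
    stands in for a real named-entity comparison.
    """
    if not source_entities:
        return 0.0
    text = summary.lower()
    kept = [e for e in source_entities if e.lower() in text]
    return len(kept) / len(source_entities)
```

For example, `rouge_n_f1(reference_summary, generated_summary, n=2)` would give a ROUGE-2 style score, and `entity_preservation(["aspirin", "placebo"], generated_summary)` would report the share of those two entities that survive in the summary.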


Pages: 325-336

Page count: 12
