A Study Case of Automatic Archival Research and Compilation using Large Language Models

Times Cited: 0
Authors
Guo, Dongsheng [1 ]
Yue, Aizhen [1 ]
Ning, Fanggang [2 ]
Huang, Dengrong [1 ]
Chang, Bingxin [1 ]
Duan, Qiang [1 ]
Zhang, Lianchao [2 ]
Chen, Zhaoliang [2 ]
Zhang, Zheng [1 ]
Zhan, Enhao [1 ]
Zhang, Qilai [1 ]
Jiang, Kai [1 ]
Li, Rui [1 ]
Zhao, Shaoxiang [2 ]
Wei, Zizhong [1 ]
Affiliations
[1] Inspur Acad Sci & Technol, Jinan, Shandong, Peoples R China
[2] Inspur Software Co Ltd, Jinan, Shandong, Peoples R China
Source
2023 IEEE INTERNATIONAL CONFERENCE ON KNOWLEDGE GRAPH, ICKG | 2023
Keywords
Archival research and compilation; Automatic method; Large language models; Fine-tuning
DOI
10.1109/ICKG59574.2023.00012
CLC Classification Code
TP18 [Artificial Intelligence Theory]
Discipline Classification Code
081104; 0812; 0835; 1405
Abstract
Archival research and compilation is a specialized task centered on the exploration, selection, and processing of vast quantities of archival documents pertaining to specific subjects. Traditionally, it has been labor-intensive and time-consuming. In recent years, advances in artificial intelligence have made automating archival research and compilation feasible. However, the limited availability of relevant samples significantly constrains the application of deep learning models, which demand large amounts of data and knowledge. In this paper, we present a case study and propose an innovative method for automatic archival research and compilation that leverages the broad knowledge base and text generation ability of large language models. Specifically, our method comprises three essential components: document retrieval, document summarization, and rule-based compilation. In the document summarization component, we fine-tune large language models to improve performance through simulation data generation and summary generation. Experimental results substantiate the effectiveness of our method. Furthermore, our method offers a general approach to applying large language models, as well as a solution for addressing similar challenges in other domains.
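The abstract describes a three-stage pipeline: document retrieval, document summarization, and rule-based compilation. The sketch below illustrates that structure only; the keyword-overlap retriever, the one-sentence summarize() placeholder (standing in for the paper's fine-tuned LLM summarizer), and the compile_report() template are assumptions made for this example, not the authors' implementation.

```python
# Minimal sketch of the retrieval -> summarization -> rule-based compilation
# pipeline outlined in the abstract. All names and heuristics here are
# illustrative assumptions; the paper uses a fine-tuned LLM for summarization.
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    text: str


def retrieve(documents: list[Document], topic: str, top_k: int = 3) -> list[Document]:
    """Rank documents by crude keyword overlap with the topic (stand-in retriever)."""
    topic_terms = set(topic.lower().split())

    def score(doc: Document) -> int:
        return len(topic_terms & set(doc.text.lower().split()))

    return sorted(documents, key=score, reverse=True)[:top_k]


def summarize(doc: Document) -> str:
    """Placeholder summarizer: keeps only the first sentence.

    In the paper this step is performed by a fine-tuned large language model.
    """
    return doc.text.split(".")[0].strip() + "."


def compile_report(topic: str, docs: list[Document]) -> str:
    """Rule-based compilation: a fixed heading template plus one summary per document."""
    lines = [f"Subject: {topic}", ""]
    for i, doc in enumerate(docs, 1):
        lines.append(f"[{i}] {doc.title}: {summarize(doc)}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy archive; real inputs would be retrieved archival documents.
    archive = [
        Document("Flood relief 1998", "Flood relief efforts began in August 1998. Local records describe the response."),
        Document("Harvest report", "The autumn harvest exceeded expectations. Grain output rose sharply."),
    ]
    selected = retrieve(archive, "1998 flood relief")
    print(compile_report("1998 flood relief", selected))
```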
Pages: 52 - 59
Number of Pages: 8
Related Papers
50 records in total
  • [41] Investigating large language models capabilities for automatic code repair in Python
    Omari, Safwan
    Basnet, Kshitiz
    Wardat, Mohammad
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2024, 27 (08): 10717 - 10731
  • [42] Adapting Large Language Models for Automatic Annotation of Radiology Reports for Metastases Detection
    Barabadi, Maede Ashofteh
    Chan, Wai Yip
    Zhu, Xiaodan
    Simpson, Amber L.
    Do, Richard K. G.
    2024 IEEE CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, CCECE 2024, 2024, : 340 - 345
  • [43] LLM-Commentator: Novel fine-tuning strategies of large language models for automatic commentary generation using football event data
    Cook, Alec
    Karakul, Oktay
    KNOWLEDGE-BASED SYSTEMS, 2024, 300
  • [44] Instruction-Tuned Large-Language Models for Quality Control in Automatic Item Generation: A Feasibility Study
    Gorgun, Guher
    Bulut, Okan
    EDUCATIONAL MEASUREMENT-ISSUES AND PRACTICE, 2025, 44 (01) : 96 - 107
  • [45] Repeatability of Fine-Tuning Large Language Models Illustrated Using QLoRA
    Alahmari, Saeed S.
    Hall, Lawrence O.
    Mouton, Peter R.
    Goldgof, Dmitry B.
    IEEE ACCESS, 2024, 12 : 153221 - 153231
  • [46] Agile Project Management Using Large Language Models
    Dhruva, G.
    Shettigar, Ishaan
    Parthasarthy, Srikrshna
    Sapna, V. M.
    2024 5TH INTERNATIONAL CONFERENCE ON INNOVATIVE TRENDS IN INFORMATION TECHNOLOGY, ICITIIT 2024, 2024,
  • [47] Using large language models to create narrative events
    Bartalesi, Valentina
    Lenzi, Emanuele
    De Martino, Claudio
    PEERJ COMPUTER SCIENCE, 2024, 10
  • [48] Corporate Event Predictions Using Large Language Models
    Xiao, Zhaomin
    Mai, Zhelu
    Xu, Zhuoer
    Cui, Yachen
    Li, Jiancheng
    2023 10TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING & MACHINE INTELLIGENCE, ISCMI, 2023, : 193 - 197
  • [49] Using Large Language Models to Understand Telecom Standards
    Karapantelakis, Athanasios
    Thakur, Mukesh
    Nikou, Alexandros
    Moradi, Farnaz
    Olrog, Christian
    Gaim, Fitsum
    Holm, Henrik
    Nimara, Doumitrou Daniil
    Huang, Vincent
    2024 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING FOR COMMUNICATION AND NETWORKING, ICMLCN 2024, 2024, : 440 - 446
  • [50] Cyber Threat Hunting Using Large Language Models
    Tanksale, Vinayak
    PROCEEDINGS OF NINTH INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY, VOL 5, ICICT 2024, 2024, 1000 : 629 - 641