A Study Case of Automatic Archival Research and Compilation using Large Language Models

Cited by: 0

Authors:

Guo, Dongsheng [1]
Yue, Aizhen [1]
Ning, Fanggang [2]
Huang, Dengrong [1]
Chang, Bingxin [1]
Duan, Qiang [1]
Zhang, Lianchao [2]
Chen, Zhaoliang [2]
Zhang, Zheng [1]
Zhan, Enhao [1]
Zhang, Qilai [1]
Jiang, Kai [1]
Li, Rui [1]
Zhao, Shaoxiang [2]
Wei, Zizhong [1]

Affiliations:

[1] Inspur Acad Sci & Technol, Jinan, Shandong, Peoples R China

[2] Inspur Software Co Ltd, Jinan, Shandong, Peoples R China

Source:

2023 IEEE INTERNATIONAL CONFERENCE ON KNOWLEDGE GRAPH, ICKG | 2023

Keywords:

Archival research and compilation; Automatic method; Large language models; Fine-tuning;

DOI:

10.1109/ICKG59574.2023.00012

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Archival research and compilation is a specialized task that focuses on the exploration, selection, and processing of vast quantities of archival documents pertaining to specific subjects. Traditionally, this task has been labor-intensive and time-consuming. In recent years, advances in artificial intelligence have made automatic archival research and compilation feasible. However, the limited availability of relevant samples significantly constrains the application of deep learning models, given their high demand for data and knowledge. In this paper, we present a case study and propose an innovative method for automatic archival research and compilation, leveraging the robust knowledge base and text generation ability of large language models. Specifically, our method comprises three essential components: document retrieval, document summarization, and rule-based compilation. In the document summarization component, we leverage fine-tuned large language models to improve performance through simulated data generation and summary generation. Experimental results substantiate the effectiveness of our method. Furthermore, our method offers a general approach to using large language models, as well as a solution for addressing similar challenges in other domains.
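
The three components named in the abstract (document retrieval, document summarization, rule-based compilation) can be pictured as a simple pipeline. The Python sketch below is illustrative only and is not the authors' implementation: the `Document` class, the keyword-overlap ranking in `retrieve`, the first-sentence `summarize` placeholder (standing in for the fine-tuned large language model), and the date-ordering rule in `compile_entries` are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class Document:
    # Hypothetical record layout; the paper's actual schema is not given here.
    doc_id: str
    date: str   # ISO date string, e.g. "1998-05-01"
    text: str


def retrieve(docs, topic, top_k=2):
    """Toy retrieval: rank documents by how many topic words they contain."""
    terms = set(topic.lower().split())
    scored = [(sum(w in terms for w in d.text.lower().split()), d) for d in docs]
    hits = [(s, d) for s, d in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in hits[:top_k]]


def summarize(doc):
    """Placeholder for the fine-tuned LLM: keep only the first sentence."""
    return doc.text.split(".")[0].strip() + "."


def compile_entries(docs):
    """Assumed compilation rule: order by date, one formatted line per document."""
    ordered = sorted(docs, key=lambda d: d.date)
    return "\n".join(f"{d.date} [{d.doc_id}] {summarize(d)}" for d in ordered)


archive = [
    Document("A1", "1998-05-01", "Flood control works began on the river. Costs were high."),
    Document("A2", "1996-03-12", "The city approved a flood control plan. Funding followed."),
    Document("A3", "1997-07-30", "A new school opened downtown. Enrollment grew."),
]

report = compile_entries(retrieve(archive, "flood control"))
print(report)
```

In a real system the `summarize` step would call a fine-tuned model, and the retrieval step would likely use a proper search index; only the overall three-stage shape follows the abstract.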


Pages: 52-59

Page count: 8
