A Study Case of Automatic Archival Research and Compilation using Large Language Models

Times Cited: 0
Authors
Guo, Dongsheng [1 ]
Yue, Aizhen [1 ]
Ning, Fanggang [2 ]
Huang, Dengrong [1 ]
Chang, Bingxin [1 ]
Duan, Qiang [1 ]
Zhang, Lianchao [2 ]
Chen, Zhaoliang [2 ]
Zhang, Zheng [1 ]
Zhan, Enhao [1 ]
Zhang, Qilai [1 ]
Jiang, Kai [1 ]
Li, Rui [1 ]
Zhao, Shaoxiang [2 ]
Wei, Zizhong [1 ]
Affiliations
[1] Inspur Acad Sci & Technol, Jinan, Shandong, Peoples R China
[2] Inspur Software Co Ltd, Jinan, Shandong, Peoples R China
Source
2023 IEEE INTERNATIONAL CONFERENCE ON KNOWLEDGE GRAPH, ICKG | 2023
Keywords
Archival research and compilation; Automatic method; Large language models; Fine-tuning
DOI
10.1109/ICKG59574.2023.00012
CLC Classification Code
TP18 [Artificial Intelligence Theory]
Discipline Classification Code
081104; 0812; 0835; 1405
Abstract
Archival research and compilation is a specialized task centered on the exploration, selection, and processing of vast quantities of archival documents pertaining to specific subjects. Traditionally, it has been labor-intensive and time-consuming. In recent years, advances in artificial intelligence have made automating archival research and compilation feasible. However, the limited availability of relevant samples significantly constrains the application of deep learning models, which demand large amounts of data and knowledge. In this paper, we present a case study and propose an innovative method for automatic archival research and compilation that leverages the broad knowledge base and text generation ability of large language models. Specifically, our method comprises three essential components: document retrieval, document summarization, and rule-based compilation. In the document summarization component, we fine-tune large language models to improve performance through simulation data generation and summary generation. Experimental results substantiate the effectiveness of our method. Furthermore, our method offers a general approach to applying large language models, as well as a solution for addressing similar challenges in other domains.
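The abstract describes a three-stage pipeline: document retrieval, document summarization, and rule-based compilation. The sketch below illustrates that structure only; the keyword-overlap retriever, the one-sentence summarize() placeholder (standing in for the paper's fine-tuned LLM summarizer), and the compile_report() template are assumptions made for this example, not the authors' implementation.

```python
# Minimal sketch of the retrieval -> summarization -> rule-based compilation
# pipeline outlined in the abstract. All names and heuristics here are
# illustrative assumptions; the paper uses a fine-tuned LLM for summarization.
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    text: str


def retrieve(documents: list[Document], topic: str, top_k: int = 3) -> list[Document]:
    """Rank documents by crude keyword overlap with the topic (stand-in retriever)."""
    topic_terms = set(topic.lower().split())

    def score(doc: Document) -> int:
        return len(topic_terms & set(doc.text.lower().split()))

    return sorted(documents, key=score, reverse=True)[:top_k]


def summarize(doc: Document) -> str:
    """Placeholder summarizer: keeps only the first sentence.

    In the paper this step is performed by a fine-tuned large language model.
    """
    return doc.text.split(".")[0].strip() + "."


def compile_report(topic: str, docs: list[Document]) -> str:
    """Rule-based compilation: a fixed heading template plus one summary per document."""
    lines = [f"Subject: {topic}", ""]
    for i, doc in enumerate(docs, 1):
        lines.append(f"[{i}] {doc.title}: {summarize(doc)}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Toy archive; real inputs would be retrieved archival documents.
    archive = [
        Document("Flood relief 1998", "Flood relief efforts began in August 1998. Local records describe the response."),
        Document("Harvest report", "The autumn harvest exceeded expectations. Grain output rose sharply."),
    ]
    selected = retrieve(archive, "1998 flood relief")
    print(compile_report("1998 flood relief", selected))
```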
Pages: 52 - 59
Number of Pages: 8
Related Papers
50 records in total
  • [41] Investigating large language models capabilities for automatic code repair in Python
    Omari, Safwan
    Basnet, Kshitiz
    Wardat, Mohammad
    CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2024, 27 (08): 10717 - 10731
  • [42] Adapting Large Language Models for Automatic Annotation of Radiology Reports for Metastases Detection
    Barabadi, Maede Ashofteh
    Chan, Wai Yip
    Zhu, Xiaodan
    Simpson, Amber L.
    Do, Richard K. G.
    2024 IEEE CANADIAN CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING, CCECE 2024, 2024, : 340 - 345
  • [43] LLM-Commentator: Novel fine-tuning strategies of large language models for automatic commentary generation using football event data
    Cook, Alec
    Karakul, Oktay
    KNOWLEDGE-BASED SYSTEMS, 2024, 300
  • [44] Instruction-Tuned Large-Language Models for Quality Control in Automatic Item Generation: A Feasibility Study
    Gorgun, Guher
    Bulut, Okan
    EDUCATIONAL MEASUREMENT-ISSUES AND PRACTICE, 2025, 44 (01) : 96 - 107
  • [45] Repeatability of Fine-Tuning Large Language Models Illustrated Using QLoRA
    Alahmari, Saeed S.
    Hall, Lawrence O.
    Mouton, Peter R.
    Goldgof, Dmitry B.
    IEEE ACCESS, 2024, 12 : 153221 - 153231
  • [46] Agile Project Management Using Large Language Models
    Dhruva, G.
    Shettigar, Ishaan
    Parthasarthy, Srikrshna
    Sapna, V. M.
    2024 5TH INTERNATIONAL CONFERENCE ON INNOVATIVE TRENDS IN INFORMATION TECHNOLOGY, ICITIIT 2024, 2024,
  • [47] Using large language models to create narrative events
    Bartalesi, Valentina
    Lenzi, Emanuele
    De Martino, Claudio
    PEERJ COMPUTER SCIENCE, 2024, 10
  • [48] Corporate Event Predictions Using Large Language Models
    Xiao, Zhaomin
    Mai, Zhelu
    Xu, Zhuoer
    Cui, Yachen
    Li, Jiancheng
    2023 10TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING & MACHINE INTELLIGENCE, ISCMI, 2023, : 193 - 197
  • [49] Using Large Language Models to Understand Telecom Standards
    Karapantelakis, Athanasios
    Thakur, Mukesh
    Nikou, Alexandros
    Moradi, Farnaz
    Olrog, Christian
    Gaim, Fitsum
    Holm, Henrik
    Nimara, Doumitrou Daniil
    Huang, Vincent
    2024 IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING FOR COMMUNICATION AND NETWORKING, ICMLCN 2024, 2024, : 440 - 446
  • [50] Cyber Threat Hunting Using Large Language Models
    Tanksale, Vinayak
    PROCEEDINGS OF NINTH INTERNATIONAL CONGRESS ON INFORMATION AND COMMUNICATION TECHNOLOGY, VOL 5, ICICT 2024, 2024, 1000 : 629 - 641