A Study Case of Automatic Archival Research and Compilation using Large Language Models

Citations: 0
Authors
Guo, Dongsheng [1]
Yue, Aizhen [1]
Ning, Fanggang [2]
Huang, Dengrong [1]
Chang, Bingxin [1]
Duan, Qiang [1]
Zhang, Lianchao [2]
Chen, Zhaoliang [2]
Zhang, Zheng [1]
Zhan, Enhao [1]
Zhang, Qilai [1]
Jiang, Kai [1]
Li, Rui [1]
Zhao, Shaoxiang [2]
Wei, Zizhong [1]
Affiliations
[1] Inspur Acad Sci & Technol, Jinan, Shandong, Peoples R China
[2] Inspur Software Co Ltd, Jinan, Shandong, Peoples R China
Source
2023 IEEE INTERNATIONAL CONFERENCE ON KNOWLEDGE GRAPH, ICKG | 2023
Keywords
Archival research and compilation; Automatic method; Large language models; Fine-tuning
DOI
10.1109/ICKG59574.2023.00012
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Archival research and compilation is a specialized task that focuses on the exploration, selection, and processing of vast quantities of archival documents pertaining to specific subjects. Traditionally, this task has been labor-intensive and time-consuming. In recent years, advances in artificial intelligence have made automatic archival research and compilation feasible. However, the limited availability of relevant samples significantly constrains the application of deep learning models, given their high demand for data and knowledge. In this paper, we present a case study and propose an innovative method for automatic archival research and compilation, leveraging the robust knowledge base and text generation ability of large language models. Specifically, our method comprises three essential components: document retrieval, document summarization, and rule-based compilation. In the document summarization component, we use fine-tuned large language models to improve performance through simulated data generation and summary generation. Experimental results substantiate the effectiveness of our method. Furthermore, our method offers a general approach to using large language models, as well as a solution for addressing similar challenges in other domains.
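The three components named in the abstract (document retrieval, document summarization, rule-based compilation) can be pictured as a simple pipeline. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the retrieval method, the models, or the compilation rules, so every name here (`Document`, `retrieve`, `summarize`, `compile_report`) is hypothetical, retrieval is approximated by word overlap, the fine-tuned large language model is replaced by a first-sentence placeholder, and the "rule" is chronological ordering in a fixed template.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    # Hypothetical record type; the paper's actual data schema is not given.
    doc_id: str
    created: date
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase, split on whitespace, and strip simple punctuation."""
    return {w.strip(".,;:!?()").lower() for w in text.split()} - {""}


def retrieve(topic: str, corpus: list[Document], top_k: int = 2) -> list[Document]:
    """Stage 1 (assumed): rank documents by word overlap with the topic."""
    query = tokenize(topic)
    scored = [(len(query & tokenize(d.text)), d) for d in corpus]
    hits = [(s, d) for s, d in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in hits[:top_k]]


def summarize(doc: Document) -> str:
    """Stage 2 (placeholder): stands in for a fine-tuned LLM summarizer.

    Here it simply returns the first sentence of the document.
    """
    return doc.text.split(".")[0].strip() + "."


def compile_report(topic: str, docs: list[Document]) -> str:
    """Stage 3 (assumed rule): chronological order in a fixed template."""
    lines = [f"Topic: {topic}"]
    for d in sorted(docs, key=lambda d: d.created):
        lines.append(f"- {d.created.isoformat()} [{d.doc_id}] {summarize(d)}")
    return "\n".join(lines)


# Illustrative, made-up archive entries.
corpus = [
    Document("A-002", date(1998, 5, 1),
             "Flood relief funds were allocated to the county. Details follow."),
    Document("A-001", date(1997, 3, 12),
             "The county reported severe flood damage. Roads were closed."),
    Document("A-003", date(1999, 1, 8),
             "Annual school enrollment figures were published."),
]

report = compile_report("county flood", retrieve("county flood", corpus))
print(report)
```

In a system closer to what the abstract describes, `summarize` would call a fine-tuned model and `retrieve` would likely use a stronger ranking method; the sketch only shows how the three stages hand data to one another.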
Pages: 52-59
Page count: 8