A Study Case of Automatic Archival Research and Compilation using Large Language Models

Cited by: 0
Authors
Guo, Dongsheng [1]
Yue, Aizhen [1]
Ning, Fanggang [2]
Huang, Dengrong [1]
Chang, Bingxin [1]
Duan, Qiang [1]
Zhang, Lianchao [2]
Chen, Zhaoliang [2]
Zhang, Zheng [1]
Zhan, Enhao [1]
Zhang, Qilai [1]
Jiang, Kai [1]
Li, Rui [1]
Zhao, Shaoxiang [2]
Wei, Zizhong [1]
Affiliations
[1] Inspur Acad Sci & Technol, Jinan, Shandong, Peoples R China
[2] Inspur Software Co Ltd, Jinan, Shandong, Peoples R China
Source
2023 IEEE INTERNATIONAL CONFERENCE ON KNOWLEDGE GRAPH, ICKG | 2023
Keywords
Archival research and compilation; Automatic method; Large language models; Fine-tuning;
DOI
10.1109/ICKG59574.2023.00012
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Archival research and compilation is a specialized task that involves exploring, selecting, and processing vast quantities of archival documents on specific subjects. Traditionally, this task has been labor-intensive and time-consuming. In recent years, advances in artificial intelligence have made automatic archival research and compilation feasible. However, deep learning models demand large amounts of data and knowledge, so the limited availability of relevant samples significantly constrains their application. In this paper, we present a case study and propose a method for automatic archival research and compilation that leverages the knowledge base and text generation ability of large language models. Specifically, our method comprises three components: document retrieval, document summarization, and rule-based compilation. In the document summarization component, we use fine-tuned large language models to improve performance through simulated data generation and summary generation. Experimental results substantiate the effectiveness of our method. Furthermore, our method offers a general approach to using large language models, as well as a solution to similar challenges in other domains.
Pages: 52-59
Number of pages: 8