A Study Case of Automatic Archival Research and Compilation using Large Language Models

Cited by: 0
Authors
Guo, Dongsheng [1]
Yue, Aizhen [1]
Ning, Fanggang [2]
Huang, Dengrong [1]
Chang, Bingxin [1]
Duan, Qiang [1]
Zhang, Lianchao [2]
Chen, Zhaoliang [2]
Zhang, Zheng [1]
Zhan, Enhao [1]
Zhang, Qilai [1]
Jiang, Kai [1]
Li, Rui [1]
Zhao, Shaoxiang [2]
Wei, Zizhong [1]
Affiliations
[1] Inspur Acad Sci & Technol, Jinan, Shandong, Peoples R China
[2] Inspur Software Co Ltd, Jinan, Shandong, Peoples R China
Source
2023 IEEE International Conference on Knowledge Graph (ICKG) | 2023
Keywords
Archival research and compilation; Automatic method; Large language models; Fine-tuning;
DOI
10.1109/ICKG59574.2023.00012
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Archival research and compilation is a specialized task that focuses on exploration, selection and processing of vast quantities of archival documents pertaining to specific subjects. Traditionally, this task has been characterized by its labor-intensive and time-consuming requirements. In recent years, the advancement of artificial intelligence has made automatic archival research and compilation tasks feasible. However, the limited availability of relevant samples imposes significant constraints on the application of deep learning models, given their high demand for sufficient data and knowledge. In this paper, we present a study case and propose an innovative method for automatic archival research and compilation, leveraging the robust knowledge base and text generation ability offered by large language models. Specifically, our method comprises three essential components: document retrieval, document summarization, and rule-based compilation. In the document summarization component, we leverage fine-tuned large language models to enhance the performance by simulation data generation and summary generation. Experimental results substantiate the effectiveness of our method. Furthermore, our method provides a general idea in using large language models, as well as a solution for addressing similar challenges in different domains.
Pages: 52-59
Number of pages: 8