Large language model for interpreting research policy using adaptive two-stage retrieval augmented fine-tuning method

Cited: 0
Authors
Ren, Runtao [1 ]
Ma, Jian [1 ]
Zheng, Zhimin [2 ]
Affiliations
[1] City Univ Hong Kong, Dept Informat Syst, Kowloon Tong, Hong Kong, Peoples R China
[2] Natl Nat Sci Fdn China, Bur Planning, Beijing, Peoples R China
Keywords
Generative AI; Large Language Model; Retrieval-augmented Generation; Fine-tuning; Interpretability;
D O I
10.1016/j.eswa.2025.127330
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Accurate interpretation of scientific funding policies is crucial for government funding agencies and research institutions to make informed decisions and allocate research funds effectively. However, current large language model (LLM)-based systems often generate responses without references, leading to a lack of interpretability needed for policy enforcement. This study introduces the Adaptive Two-stage Retrieval Augmented Fine-Tuning (AT-RAFT) method, a novel LLM-based approach specifically designed for science policy interpretation. AT-RAFT incorporates three complementary artifacts: a two-stage retrieval mechanism, adaptive hard-negative fine-tuning, and an interpretable response interface. It is trained directly on policy documents, allowing the model to provide reference answers based on retrieved text while also offering the original policy context to enhance interpretability. Our experiments demonstrate that AT-RAFT improves retrieval accuracy by 48% and generation performance by 44% compared to existing baseline systems, effectively supporting real-world decision-making tasks for stakeholders in research institutions and funding agencies. Our proposed method has been adopted by ScholarMate, the largest professional research social networking platform in China, and is now deployed on their platform, providing global users with access to advanced policy interpretation tools. Additionally, a demo version of the instantiated interface is available at https://github.com/renruntao/ResearchPolicy_RAG.
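The abstract's central idea, retrieving policy passages in two stages and returning the original text alongside the answer for interpretability, can be sketched as follows. This is a minimal illustration, not the paper's AT-RAFT implementation: the functions (`coarse_score`, `rerank_score`, `retrieve`) and the toy policy snippets are hypothetical, and simple lexical scoring stands in for the learned retrievers described in the paper.

```python
# Hypothetical two-stage retrieval sketch: a cheap lexical pass shortlists
# candidate policy passages, then a finer similarity score reranks them.
# The retrieved passages double as the citable "original policy context".
from collections import Counter
import math

POLICY_DOCS = [  # illustrative stand-ins for real policy documents
    "Funding applications must include a detailed budget justification.",
    "Early-career researchers may apply for the young scientist grant.",
    "Final reports are due within 60 days of project completion.",
]

def tokenize(text):
    return [w.strip(".,?!").lower() for w in text.split()]

def coarse_score(query, doc):
    # Stage 1: set-overlap of tokens -- fast, recall-oriented shortlisting.
    return len(set(tokenize(query)) & set(tokenize(doc)))

def rerank_score(query, doc):
    # Stage 2: term-frequency cosine similarity on the shortlist only.
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    dot = sum(q[t] * d[t] for t in q)
    norm = math.sqrt(sum(v * v for v in q.values())) * \
           math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0

def retrieve(query, k1=2, k2=1):
    shortlist = sorted(POLICY_DOCS,
                       key=lambda d: coarse_score(query, d), reverse=True)[:k1]
    return sorted(shortlist,
                  key=lambda d: rerank_score(query, d), reverse=True)[:k2]

print(retrieve("When is the final report due?"))
# → ['Final reports are due within 60 days of project completion.']
```

In the paper's system the two stages are learned retrievers refined with adaptive hard-negative fine-tuning; here both stages are deliberately simple so the shortlist-then-rerank control flow is visible.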
Pages: 16