Plan, Generate and Match: Scientific Workflow Recommendation with Large Language Models

Cited by: 2
Authors
Gu, Yang [1]
Cao, Jian [1]
Guo, Yuan [1]
Qian, Shiyou [1]
Guan, Wei [1]
Affiliation
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
Source
SERVICE-ORIENTED COMPUTING, ICSOC 2023, PT I | 2023 / Vol. 14419
Funding
U.S. National Science Foundation;
Keywords
Scientific Workflow Recommendation; Large Language Models; Planning; Prompting;
DOI
10.1007/978-3-031-48421-6_7
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Recommending scientific workflows from public repositories that meet users' natural language requirements is becoming increasingly essential in the scientific community. However, existing methods that rely on direct text matching struggle with complex queries, which results in poor performance. Large language models (LLMs) have recently exhibited exceptional ability in planning and reasoning. We propose "Plan, Generate and Match" (PGM), a scientific workflow recommendation method that leverages LLMs. PGM consists of three stages: using LLMs to plan solution steps upon receiving a user query, generating a structured workflow specification guided by those steps, and matching the resulting plans and specifications against candidate workflows. By incorporating this planning mechanism, PGM uses few-shot prompting to automatically generate well-considered steps that guide the recommendation of reliable workflows. This method represents the first exploration of incorporating LLMs into the scientific workflow domain. Experimental results on real-world benchmarks demonstrate that PGM outperforms state-of-the-art methods with statistical significance, highlighting its potential for addressing complex requirements.
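The three stages named in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the `LLM` callable, the prompt wording, the `toy_llm` stand-in, the plain-text specification, and the Jaccard word-overlap score are all hypothetical choices made here so the example runs offline. The abstract does not say how PGM prompts the model or scores candidates.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: a function that takes a prompt and returns text.
LLM = Callable[[str], str]


def jaccard(a: str, b: str) -> float:
    """Toy similarity: word-set overlap (an assumption, not PGM's matcher)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def plan(query: str, llm: LLM) -> List[str]:
    """Stage 1 (Plan): ask the LLM for solution steps, one per line."""
    reply = llm(f"List the steps needed to solve: {query}")
    return [line.strip() for line in reply.splitlines() if line.strip()]


def generate_spec(query: str, steps: List[str], llm: LLM) -> str:
    """Stage 2 (Generate): ask for a workflow specification guided by the steps.

    A plain string is used here; the abstract calls for a structured spec.
    """
    prompt = f"Describe a workflow for: {query}\nSteps:\n" + "\n".join(steps)
    return llm(prompt).strip()


def match(steps: List[str], spec: str,
          candidates: Dict[str, str]) -> List[Tuple[str, float]]:
    """Stage 3 (Match): rank candidate workflows against plan plus spec."""
    target = " ".join(steps) + " " + spec
    scored = [(name, jaccard(target, desc)) for name, desc in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def recommend(query: str, candidates: Dict[str, str],
              llm: LLM) -> List[Tuple[str, float]]:
    """Run the three stages in order and return candidates, best first."""
    steps = plan(query, llm)
    spec = generate_spec(query, steps, llm)
    return match(steps, spec, candidates)


def toy_llm(prompt: str) -> str:
    """Canned stand-in for a real model so the sketch runs without a network."""
    if prompt.startswith("List the steps"):
        return "fetch protein sequence\nalign sequence with blast\n"
    return "workflow that will fetch a protein sequence and align it with blast"


# Hypothetical candidate repository: workflow name -> text description.
CANDIDATES = {
    "blast_align": "fetch protein sequence then align with blast",
    "weather_plot": "download weather data and plot temperature",
}

ranking = recommend("find similar proteins", CANDIDATES, toy_llm)
print(ranking[0][0])
```

Passing the model in as a callable keeps the pipeline independent of any particular LLM provider, and the matching step can be replaced by a stronger scorer without touching the first two stages.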
Pages: 86-102
Page count: 17