Opportunities for retrieval and tool augmented large language models in scientific facilities

Cited by: 6
Authors
Prince, Michael H. [1]
Chan, Henry [2]
Vriza, Aikaterini [2]
Zhou, Tao [2]
Sastry, Varuni K. [3]
Luo, Yanqi [1]
Dearing, Matthew T. [4]
Harder, Ross J. [1]
Vasudevan, Rama K. [5]
Cherukara, Mathew J. [1]
Affiliations
[1] Argonne Natl Lab, Adv Photon Source, Lemont, IL 60439 USA
[2] Argonne Natl Lab, Ctr Nanoscale Mat, Lemont, IL USA
[3] Argonne Natl Lab, Argonne Leadership Comp Facil, Lemont, IL USA
[4] Argonne Natl Lab, Business & Informat Syst, Lemont, IL USA
[5] Oak Ridge Natl Lab, Ctr Nanophase Mat, Oak Ridge, TN USA
Keywords
Photons - Problem oriented languages;
DOI
10.1038/s41524-024-01423-2
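Chinese Library Classification (CLC) number
O64 [Physical chemistry (theoretical chemistry), chemical physics]
Discipline classification codes
070304; 081704
Abstract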
Upgrades to advanced scientific user facilities such as next-generation x-ray light sources, nanoscience centers, and neutron facilities are revolutionizing our understanding of materials across the spectrum of the physical sciences, from life sciences to microelectronics. However, these facility and instrument upgrades come with a significant increase in complexity. Driven by more exacting scientific needs, instruments and experiments become more intricate each year. This increased operational complexity makes it ever more challenging for domain scientists to design experiments that effectively leverage the capabilities of and operate on these advanced instruments. Large language models (LLMs) can perform complex information retrieval, assist in knowledge-intensive tasks across applications, and provide guidance on tool usage. Using x-ray light sources, leadership computing, and nanoscience centers as representative examples, we describe preliminary experiments with a Context-Aware Language Model for Science (CALMS) to assist scientists with instrument operations and complex experimentation. With the ability to retrieve relevant information from facility documentation, CALMS can answer simple questions on scientific capabilities and other operational procedures. With the ability to interface with software tools and experimental hardware, CALMS can conversationally operate scientific instruments. By making information more accessible and acting on user needs, LLMs could expand and diversify scientific facilities' users and accelerate scientific output.
Pages: 8