Large language models in plant biology

Cited by: 21
Authors
Lam, Hilbert Yuen In [1]
Ong, Xing Er [1]
Mutwil, Marek [1]
Affiliations
[1] Nanyang Technol Univ, Sch Biol Sci, 60 Nanyang Dr, Singapore 637551, Singapore
Keywords
AI; decoder; embedding; encoder; foundation model; large language models; transformer
DOI
10.1016/j.tplants.2024.04.013
Chinese Library Classification (CLC)
Q94 [Botany]
Discipline classification code
071001
Abstract
Large language models (LLMs), such as ChatGPT, have taken the world by storm. However, LLMs are not limited to human language and can be used to analyze sequential data, such as DNA, protein, and gene expression. The resulting foundation models can be repurposed to identify the complex patterns within the data, resulting in powerful, multipurpose prediction tools able to predict the state of cellular systems. This review outlines the different types of LLMs and showcases their recent uses in biology. Since LLMs have not yet been embraced by the plant community, we also cover how these models can be deployed for the plant kingdom.
Pages: 1145-1155
Number of pages: 11