LLM based Biological Named Entity Recognition from Scientific Literature

Cited by: 5
Authors
Jung, Sung Jae [1]
Kim, Hajung [2]
Jang, Kyoung Sang [1]
Affiliations
[1] EnCore Inc, Div DG & AI Business, Seoul, South Korea
[2] EnCore Inc, Div DT Business, Seoul, South Korea
Source
2024 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING, IEEE BIGCOMP 2024 | 2024
Keywords
Biological Named Entity Recognition (BNER); Large Language Models (LLM); Prompt Engineering; p53; Protein;
DOI
10.1109/BigComp60711.2024.00095
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recently, the application of Large Language Models (LLMs) in the field of natural language processing has witnessed remarkable growth, revolutionizing the field of bioinformatics by automating the extraction of biological entities from scientific literature. This study presents the development and evaluation of a Biological Named Entity Recognizer (BNER) using a pre-trained Large Language Model (LLM) refined through prompt engineering. The BNER was tailored to identify proteins, genes, and small molecules within scientific texts, specifically targeting the context of p53 protein-related research. To assess the BNER's efficacy, we curated a dataset comprising ten paragraphs extracted from the abstracts and significant sections of five high-relevance scientific papers. The system's performance was quantified through an entity recognition task, resulting in 51 true positives (TP), 10 false positives (FP), and 3 false negatives (FN). The BNER achieved an F1 score of 0.887, demonstrating a high degree of precision and recall. These results validate the utility of LLMs in bioinformatics and highlight the BNER's potential to support and accelerate scientific discovery by providing accurate, structured data outputs suitable for comprehensive analysis.
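The reported F1 score can be checked against the counts given in the abstract. The sketch below uses the standard definitions of precision, recall, and F1; the abstract does not say how the authors computed the score, so pooling the counts over all entities is an assumption, and the helper name `precision_recall_f1` is hypothetical.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Standard entity-level metrics from raw counts (assumed, not from the paper)."""
    precision = tp / (tp + fp)  # share of predicted entities that are correct
    recall = tp / (tp + fn)     # share of true entities that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Counts reported in the abstract: 51 TP, 10 FP, 3 FN.
precision, recall, f1 = precision_recall_f1(tp=51, fp=10, fn=3)
print(f"precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```

Under these definitions the counts give a precision of about 0.836 and a recall of about 0.944, which is consistent with the reported F1 of 0.887. Recall is noticeably higher than precision here, because the recognizer produced more false positives (10) than false negatives (3).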
Pages: 433-435
Page count: 3