Associating biological context with protein-protein interactions through text mining at PubMed scale

Cited by: 2
Authors
Sosa, Daniel N. [1]
Hintzen, Rogier [2]
Xiong, Betty [1]
de Giorgio, Alex [2]
Fauqueur, Julien [2]
Davies, Mark [2]
Lever, Jake [3]
Altman, Russ B. [4, 5]
Affiliations
[1] Stanford Univ, Dept Biomed Data Sci, Stanford, CA USA
[2] BenevolentAI, London, England
[3] Univ Glasgow, Glasgow, Scotland
[4] Stanford Univ, Dept Bioengn, Stanford, CA 94305 USA
[5] Stanford Univ, Dept Genet, Stanford, CA USA
Keywords
Literature-based discovery; NLP; Knowledge graphs; Cellular biology; Artificial intelligence
DOI
10.1016/j.jbi.2023.104474
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Inferring knowledge from known relationships between drugs, proteins, genes, and diseases has great potential for clinical impact, such as predicting which existing drugs could be repurposed to treat rare diseases. Incorporating key biological context such as cell type or tissue of action into representations of extracted biomedical knowledge is essential for principled pharmacological discovery. Existing global, literature-derived knowledge graphs of interactions between drugs, proteins, genes, and diseases lack this essential information. In this study, we frame the task of associating biological context with protein-protein interactions extracted from text as a classification task using syntactic, semantic, and novel meta-discourse features. We introduce the Insider corpora, which are automatically generated PubMed-scale corpora for training classifiers for the context association task. These corpora are created by searching for precise syntactic cues of cell type and tissue relevancy to extracted regulatory relations. We report F1 scores of 0.955 and 0.862 for identifying relevant cell types and tissues, respectively, for our identified relations. By classifying with this framework, we demonstrate that the problem of context association can be addressed using intuitive, interpretable features. We demonstrate the potential of this approach to enrich text-derived knowledge bases with biological detail by incorporating cell type context into a protein-protein network for dengue fever.
Pages: 12