Associating biological context with protein-protein interactions through text mining at PubMed scale

Cited by: 2
Authors
Sosa, Daniel N. [1]
Hintzen, Rogier [2]
Xiong, Betty [1]
de Giorgio, Alex [2]
Fauqueur, Julien [2]
Davies, Mark [2]
Lever, Jake [3]
Altman, Russ B. [4, 5]
Affiliations
[1] Stanford Univ, Dept Biomed Data Sci, Stanford, CA USA
[2] BenevolentAI, London, England
[3] Univ Glasgow, Glasgow, Scotland
[4] Stanford Univ, Dept Bioengn, Stanford, CA 94305 USA
[5] Stanford Univ, Dept Genet, Stanford, CA USA
Keywords
Literature-based discovery; NLP; Knowledge graphs; Cellular biology; Artificial intelligence
DOI
10.1016/j.jbi.2023.104474
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Inferring knowledge from known relationships between drugs, proteins, genes, and diseases has great potential for clinical impact, such as predicting which existing drugs could be repurposed to treat rare diseases. Incorporating key biological context such as cell type or tissue of action into representations of extracted biomedical knowledge is essential for principled pharmacological discovery. Existing global, literature-derived knowledge graphs of interactions between drugs, proteins, genes, and diseases lack this essential information. In this study, we frame the task of associating biological context with protein-protein interactions extracted from text as a classification task using syntactic, semantic, and novel meta-discourse features. We introduce the Insider corpora, which are automatically generated PubMed-scale corpora for training classifiers for the context association task. These corpora are created by searching for precise syntactic cues of cell type and tissue relevancy to extracted regulatory relations. We report F1 scores of 0.955 and 0.862 for identifying relevant cell types and tissues, respectively, for our identified relations. By classifying with this framework, we demonstrate that the problem of context association can be addressed using intuitive, interpretable features. We demonstrate the potential of this approach to enrich text-derived knowledge bases with biological detail by incorporating cell type context into a protein-protein network for dengue fever.
Pages: 12