Predicting the involvement of polyQ- and polyA in protein-protein interactions by their amino acid context

Authors
Mier, Pablo [1]
Andrade-Navarro, Miguel A. [1]
Affiliations
[1] Johannes Gutenberg Univ Mainz, Inst Organism & Mol Evolut, Fac Biol, Hans Dieter Husch Weg 15, D-55128 Mainz, Germany
Keywords
Homorepeat; Polyglutamine; Polyalanine; Protein-protein interaction; Machine learning; STRUCTURAL BASIS; AGGREGATION; RECOGNITION; HOMOREPEATS; POLYALANINE; EVOLUTION; EXPANSION; REGIONS; FIR
DOI
10.1016/j.heliyon.2024.e37861
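Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09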
Abstract
Homorepeats, specifically polyglutamine (polyQ) and polyalanine (polyA), are often implicated in protein-protein interactions (PPIs). So far, no method has been available to predict whether a homorepeat participates in protein interactions. We propose a machine learning approach to identify PPI-involved polyQ and polyA regions within the human proteome based on known interacting regions. Using the dataset of human homorepeats, we identified 157 polyQ and 745 polyA regions potentially involved in PPIs. Machine learning models trained on amino acid context and homorepeat length showed high precision (0.90-0.98) but variable recall (0.42-0.85). Random forest outperformed the other models (AUC polyQ = 0.686, AUC polyA = 0.732) when using positions -10 to +10 surrounding the homorepeat. Integrating paralog information marginally improved predictions but was excluded to keep the model simple. Further optimization showed that, for polyQ, restricting the amino acid context to positions -6 to +6 increased the AUC to 0.715; for polyA, no improvement was found. Incorporating coiled-coil overlap information improved polyA predictions (AUC = 0.745) but not polyQ predictions. Finally, we applied these models to predict PPI involvement across all polyQ and polyA regions, identifying potential interactions. Case studies illustrated the method's predictive capacity, showing known interacting regions with high scores and shedding light on potential false negatives.
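The abstract states that the models were trained on amino acid context and homorepeat length, but it does not say how those inputs were encoded. The following is a minimal sketch of one plausible encoding, not the authors' implementation. Everything named here is an assumption made for illustration: the helper names `find_homorepeats` and `context_features`, the definition of a homorepeat as a pure run of at least `min_len` residues, the one-hot encoding, and the `-` padding symbol for windows that run past the protein ends.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "-"  # assumed padding symbol for positions beyond the protein ends


def find_homorepeats(seq, residue, min_len=4):
    """Return half-open (start, end) spans of pure runs of `residue`.

    Assumption: a homorepeat is an uninterrupted run of at least
    `min_len` identical residues; the paper may define it differently.
    """
    pattern = re.compile(f"{re.escape(residue)}{{{min_len},}}")
    return [(m.start(), m.end()) for m in pattern.finditer(seq)]


def context_features(seq, start, end, flank=10):
    """Encode one homorepeat as [length] + one-hot flanking residues.

    `flank` is the number of positions taken on each side, so flank=10
    corresponds to a -10..+10 window and flank=6 to a -6..+6 window.
    """
    left = seq[max(0, start - flank):start].rjust(flank, PAD)
    right = seq[end:end + flank].ljust(flank, PAD)
    features = [end - start]  # homorepeat length
    for aa in left + right:
        # Padding and non-standard residues become an all-zero block.
        features.extend(1 if aa == symbol else 0 for symbol in AMINO_ACIDS)
    return features


# Toy sequence, invented for illustration only.
toy_seq = "MKT" + "Q" * 8 + "LLPAVE"
spans = find_homorepeats(toy_seq, "Q")
vector = context_features(toy_seq, *spans[0], flank=6)
```

A vector of this kind could then be passed to any standard classifier, such as a random forest; with `flank=10` each region yields 401 features (1 length value plus 20 positions times 20 residue types).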
Pages: 11