Predicting the involvement of polyQ- and polyA in protein-protein interactions by their amino acid context

Cited by: 0
Authors
Mier, Pablo [1]
Andrade-Navarro, Miguel A. [1]
Affiliations
[1] Johannes Gutenberg Univ Mainz, Inst Organism & Mol Evolut, Fac Biol, Hans Dieter Husch Weg 15, D-55128 Mainz, Germany
Keywords
Homorepeat; Polyglutamine; Polyalanine; Protein-protein interaction; Machine learning; STRUCTURAL BASIS; AGGREGATION; RECOGNITION; HOMOREPEATS; POLYALANINE; EVOLUTION; EXPANSION; REGIONS; FIR
DOI
10.1016/j.heliyon.2024.e37861
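Chinese Library Classification (CLC)
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract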
Homorepeats, specifically polyglutamine (polyQ) and polyalanine (polyA), are often implicated in protein-protein interactions (PPIs). So far, a method to predict the participation of homorepeats in protein interactions is lacking. We propose a machine learning approach to identify PPI-involved polyQ and polyA regions within the human proteome based on known interacting regions. Using the dataset of human homorepeats, we identified 157 polyQ and 745 polyA regions potentially involved in PPIs. Machine learning models, trained on amino acid context and homorepeat length, demonstrated high precision (0.90-0.98) but variable recall (0.42-0.85). Random forest outperformed other models (AUC polyQ = 0.686, AUC polyA = 0.732) using the positions surrounding the homorepeat -10 to +10. Integrating paralog information marginally improved predictions but was excluded for model simplicity. Further optimization revealed that for polyQ, using amino acid surrounding positions from -6 to +6 increased AUC to 0.715. For polyA, no improvement was found. Incorporating coiled coil overlap information enhanced polyA predictions (AUC = 0.745) but not polyQ. Finally, we applied these models to predict PPI involvement across all polyQ and polyA regions, identifying potential interactions. Case studies illustrated the method's predictive capacity, highlighting known interacting regions with high scores and elucidating potential false negatives.
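The abstract describes models trained on the amino acid context around each homorepeat (positions -10 to +10) together with the homorepeat length. The sketch below shows one plausible way to build such a feature vector. It is an illustration under stated assumptions, not the authors' implementation: the homorepeat definition (a pure run of at least four identical residues), the one-hot encoding, the padding symbol, and the function names `find_homorepeats`, `context_window`, and `encode` are all hypothetical choices made for this example.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "-"  # assumed placeholder for positions beyond the protein termini


def find_homorepeats(sequence, residue, min_length=4):
    """Return (start, end) spans of pure runs of `residue`; end is exclusive.

    Assumption: a homorepeat is a run of at least `min_length` identical
    residues. The paper may use a different definition.
    """
    pattern = re.compile(f"{re.escape(residue)}{{{min_length},}}")
    return [match.span() for match in pattern.finditer(sequence)]


def context_window(sequence, start, end, flank=10):
    """Return the `flank` residues before and after the repeat, padded at termini."""
    left = sequence[max(0, start - flank):start].rjust(flank, PAD)
    right = sequence[end:end + flank].ljust(flank, PAD)
    return left, right


def encode(sequence, start, end, flank=10):
    """One-hot encode positions -flank..-1 and +1..+flank, then append repeat length.

    Padding positions encode as all zeros.
    """
    left, right = context_window(sequence, start, end, flank)
    features = []
    for aa in left + right:
        features.extend(1 if aa == symbol else 0 for symbol in AMINO_ACIDS)
    features.append(end - start)  # homorepeat length as the final feature
    return features


if __name__ == "__main__":
    # Toy sequence, not a real protein.
    toy = "MKT" + "Q" * 6 + "LLPA"
    for start, end in find_homorepeats(toy, "Q"):
        print(context_window(toy, start, end, flank=4), len(encode(toy, start, end, flank=4)))
```

Vectors built this way could be passed to any standard classifier, for example a random forest, with the label indicating whether the homorepeat overlaps a known interacting region. Narrowing `flank` from 10 to 6 would correspond to the shorter polyQ window mentioned in the abstract.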
Pages: 11