PairK: Pairwise k-mer alignment for quantifying protein motif conservation in disordered regions

Cited by: 0
Authors
Halpin, Jackson C. [1]
Keating, Amy E. [1, 2, 3]
Affiliations
[1] MIT, Dept Biol, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] MIT, Dept Biol Engn, Cambridge, MA USA
[3] Koch Inst Integrat Canc Res, Cambridge, MA USA
Funding
U.S. National Institutes of Health (NIH)
Keywords
conservation; intrinsically disordered proteins; multiple sequence alignment; short linear motif; MULTIPLE SEQUENCE ALIGNMENT; NF-KAPPA-B; EVH1; DOMAIN; BINDING-SITE; CD-HIT; RECOGNITION; ACTIVATION; RESIDUES; LIGAND; SPECIFICITY
DOI
10.1002/pro.70004
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Protein-protein interactions are often mediated by a modular peptide recognition domain binding to a short linear motif (SLiM) in the disordered region of another protein. To understand the features of SLiMs that are important for binding and to identify motif instances that are important for biological function, it is useful to examine the evolutionary conservation of motifs across homologous proteins. However, the intrinsically disordered regions (IDRs) in which SLiMs reside evolve rapidly. Consequently, multiple sequence alignment (MSA) of IDRs often misaligns SLiMs and underestimates their conservation. We present PairK (pairwise k-mer alignment), an MSA-free method to align and quantify the relative local conservation of subsequences within an IDR. Lacking a ground truth for conservation, we tested PairK on the task of distinguishing biologically important motif instances from background motifs, under the assumption that biologically important motifs are more conserved. The method outperforms both standard MSA-based conservation scores and a modern LLM-based conservation score predictor. PairK can quantify conservation over wider phylogenetic distances than MSAs, indicating that some SLiMs are more conserved than MSA-based metrics imply. PairK is available as an open-source Python package. It is designed to be easily adapted for use with other SLiM tools and for diverse applications.
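The idea of MSA-free, pairwise k-mer alignment can be illustrated with a minimal sketch. This is not the PairK package or its API. The function names, the identity-based scoring, the mean-identity conservation measure, the z-score normalization, and the toy sequences are all assumptions made for illustration, since the abstract does not specify them. Each k-mer in a query IDR is matched, without gaps, to its best-scoring k-mer in every homolog IDR, and the resulting per-k-mer scores are then expressed relative to the other k-mers in the same IDR.

```python
from statistics import mean, pstdev


def kmers(seq, k):
    """Return every overlapping k-mer in seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def identity(a, b):
    """Count positions where two equal-length k-mers share a residue."""
    return sum(x == y for x, y in zip(a, b))


def best_match(query_kmer, homolog_idr):
    """Gapless search: the homolog k-mer with the highest identity score.

    Returns None when the homolog IDR is shorter than the k-mer.
    """
    candidates = kmers(homolog_idr, len(query_kmer))
    if not candidates:
        return None
    return max(candidates, key=lambda c: identity(query_kmer, c))


def kmer_conservation(query_idr, homolog_idrs, k):
    """Mean fractional identity of each query k-mer to its best matches."""
    scores = []
    for q in kmers(query_idr, k):
        matches = [m for m in (best_match(q, h) for h in homolog_idrs) if m]
        if matches:
            scores.append(mean(identity(q, m) / k for m in matches))
        else:
            scores.append(0.0)
    return scores


def relative_conservation(scores):
    """Z-score each k-mer against all k-mers in the same query IDR."""
    mu, sigma = mean(scores), pstdev(scores)
    return [(s - mu) / sigma if sigma else 0.0 for s in scores]


# Hypothetical toy sequences: a proline-rich segment sits at a different
# offset in each homolog, so no shared alignment column is needed to find it.
query = "GSAPPTPPPLPPGSG"
homologs = ["TTPPTPPPLPPAAA", "QQSSAPPTPPPLPP", "PPTPPPLPPKK"]

raw = kmer_conservation(query, homologs, k=5)
rel = relative_conservation(raw)
```

Because each homolog is searched independently, a motif that has shifted position within a rapidly evolving IDR can still be paired with its counterpart, which is the situation in which an MSA tends to misalign it. A substitution matrix could replace the identity count in `identity` without changing the rest of the sketch.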
Pages: 18