CLIP: accurate prediction of disordered linear interacting peptides from protein sequences using co-evolutionary information

Cited by: 8
Authors
Peng, Zhenling [1]
Li, Zixia [2]
Meng, Qiaozhen [3]
Zhao, Bi [4]
Kurgan, Lukasz [5,6,7,8]
Affiliations
[1] Shandong Univ, Jinan, Peoples R China
[2] Tianjin Univ, Ctr Appl Math, Tianjin, Peoples R China
[3] Tianjin Univ, Coll Intelligence & Computing, Tianjin, Peoples R China
[4] Univ S Florida, Computat Core, Tampa, FL USA
[5] AIMBE, Washington, DC 20005 USA
[6] AAIA, Hong Kong, Peoples R China
[7] European Acad Sci & Arts, Vienna, Austria
[8] Virginia Commonwealth Univ, Comp Sci, Richmond, VA 23284 USA
Funding
National Science Foundation (USA); National Natural Science Foundation of China;
Keywords
intrinsic disorder; protein-protein interactions; protein-nucleic acids interactions; linear interacting peptides; protein function; molecular recognition features; INTRINSIC DISORDER; BINDING REGIONS; WEB SERVER; MORFS; IDENTIFICATION; ANNOTATION; SLIMSEARCH; REPOSITORY; MOTIFS;
DOI
10.1093/bib/bbac502
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
One of the key features of intrinsically disordered regions (IDRs) is facilitation of protein-protein and protein-nucleic acid interactions. These disordered binding regions include molecular recognition features (MoRFs), short linear motifs (SLiMs) and longer binding domains. The vast majority of current predictors of disordered binding regions target MoRFs, with a handful of methods that predict SLiMs and disordered protein-binding domains. A new and broader class of disordered binding regions, linear interacting peptides (LIPs), was introduced recently and applied in the MobiDB resource. LIPs are segments in protein sequences that undergo a disorder-to-order transition upon binding to a protein or a nucleic acid, and they cover MoRFs, SLiMs and disordered protein-binding domains. Although current predictors of MoRFs and disordered protein-binding regions could be used to identify some LIPs, there are no dedicated sequence-based predictors of LIPs. To this end, we introduce CLIP, a new predictor of LIPs that utilizes a robust logistic regression model to combine three complementary types of inputs: co-evolutionary information derived from multiple sequence alignments, physicochemical profiles and disorder predictions. Ablation analysis suggests that the co-evolutionary information is particularly useful for this prediction and that combining the three inputs provides substantial improvements when compared to using these inputs individually. Comparative empirical assessments using low-similarity test datasets reveal that CLIP secures an area under the receiver operating characteristic curve (AUC) of 0.8 and substantially improves over the results produced by the closest current tools that predict MoRFs and disordered protein-binding regions. The webserver of CLIP is freely available at http://biomine.cs.vcu.edu/servers/CLIP/ and the standalone code can be downloaded from http://yanglab.cid.sdu.edu.cntdownload/CLIPt.
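As a rough illustration of the idea described in the abstract (a logistic regression that combines three per-residue input types and is scored by AUC), the following minimal Python sketch may help. It is not the CLIP implementation: the data are synthetic, the three feature columns only stand in for co-evolutionary, physicochemical and disorder-derived scores, and all function names (`train_logistic`, `predict`, `auc`) are hypothetical. The AUC here is computed on the training data itself, so the numbers say nothing about real predictive performance.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.1, epochs=300):
    # Plain batch gradient descent on the logistic loss (no regularization).
    n, n_feat = len(X), len(X[0])
    w, b = [0.0] * n_feat, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_feat, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_feat):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(X, w, b):
    # Per-residue propensity in [0, 1].
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def auc(scores, labels):
    # Probability that a random positive outscores a random negative
    # (ties count as one half); equivalent to the area under the ROC curve.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic residues: label 1 = inside a binding region (assumed ~30%).
# Each of the three columns is a weakly informative placeholder score.
rng = random.Random(0)
labels = [1 if rng.random() < 0.3 else 0 for _ in range(200)]
features = [[rng.gauss(l, 1.0) for _ in range(3)] for l in labels]

w, b = train_logistic(features, labels)
probs = predict(features, w, b)
combined_auc = auc(probs, labels)
single_aucs = [auc([x[j] for x in features], labels) for j in range(3)]

print("AUC of each input alone:", [round(a, 3) for a in single_aucs])
print("AUC of combined model:  ", round(combined_auc, 3))
```

Comparing the per-column AUCs with the combined AUC mirrors, in miniature, the kind of ablation the abstract mentions; a faithful reproduction would need the authors' actual features and low-similarity test datasets.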
Pages: 11
Related papers
99 in total
[1]   The contribution of intrinsically disordered regions to protein function, cellular complexity, and human disease [J].
Babu, M. Madan .
BIOCHEMICAL SOCIETY TRANSACTIONS, 2016, 44 :1185-1200
[2]   Minimotif Miner: a tool for investigating protein function [J].
Balla, S ;
Thapar, V ;
Verma, S ;
Luong, T ;
Faghri, T ;
Huang, CH ;
Rajasekaran, S ;
del Campo, JJ ;
Shinn, JH ;
Mohler, WA ;
Maciejewski, MW ;
Gryk, MR ;
Piccirillo, B ;
Schiller, SR ;
Schiller, MR .
NATURE METHODS, 2006, 3 (03) :175-177
[3]   Bioinformatics Approaches for Predicting Disordered Protein Motifs [J].
Bhowmick, Pallab ;
Guharoy, Mainak ;
Tompa, Peter .
INTRINSICALLY DISORDERED PROTEINS STUDIED BY NMR SPECTROSCOPY, 2015, 870 :291-318
[4]   Protein Data Bank: the single global archive for 3D macromolecular structure data [J].
Burley, Stephen K. ;
Berman, Helen M. ;
Bhikadiya, Charmi ;
Bi, Chunxiao ;
Chen, Li ;
Di Costanzo, Luigi ;
Christie, Cole ;
Duarte, Jose M. ;
Dutta, Shuchismita ;
Feng, Zukang ;
Ghosh, Sutapa ;
Goodsell, David S. ;
Green, Rachel Kramer ;
Guranovic, Vladimir ;
Guzenko, Dmytro ;
Hudson, Brian P. ;
Liang, Yuhe ;
Lowe, Robert ;
Peisach, Ezra ;
Periskova, Irina ;
Randle, Chris ;
Rose, Alexander ;
Sekharan, Monica ;
Shao, Chenghua ;
Tao, Yi-Ping ;
Valasatava, Yana ;
Voigt, Maria ;
Westbrook, John ;
Young, Jasmine ;
Zardecki, Christine ;
Zhuravleva, Marina ;
Kurisu, Genji ;
Nakamura, Haruki ;
Kengaku, Yumiko ;
Cho, Hasumi ;
Sato, Junko ;
Kim, Ju Yaen ;
Ikegawa, Yasuyo ;
Nakagawa, Atsushi ;
Yamashita, Reiko ;
Kudou, Takahiro ;
Bekker, Gert-Jan ;
Suzuki, Hirofumi ;
Iwata, Takeshi ;
Yokochi, Masashi ;
Kobayashi, Naohiro ;
Fujiwara, Toshimichi ;
Velankar, Sameer ;
Kleywegt, Gerard J. ;
Anyango, Stephen .
NUCLEIC ACIDS RESEARCH, 2019, 47 (D1) :D520-D528
[5]   Intrinsically Disordered Proteins: Structure, Function and Therapeutics [J].
Chen, Jianhan ;
Kriwacki, Richard W. .
JOURNAL OF MOLECULAR BIOLOGY, 2018, 430 (16) :2275-2277
[6]   Attributes of short linear motifs [J].
Davey, Norman E. ;
Van Roey, Kim ;
Weatheritt, Robert J. ;
Toedt, Grischa ;
Uyar, Bora ;
Altenberg, Brigitte ;
Budd, Aidan ;
Diella, Francesca ;
Dinkel, Holger ;
Gibson, Toby J. .
MOLECULAR BIOSYSTEMS, 2012, 8 (01) :268-281
[7]   SLiMSearch 2.0: biological context for short linear motifs in proteins [J].
Davey, Norman E. ;
Haslam, Niall J. ;
Shields, Denis C. ;
Edwards, Richard J. .
NUCLEIC ACIDS RESEARCH, 2011, 39 :W56-W60
[8]   MobiDB: a comprehensive database of intrinsic protein disorder annotations [J].
Di Domenico, Tomas ;
Walsh, Ian ;
Martin, Alberto J. M. ;
Tosatto, Silvio C. E. .
BIOINFORMATICS, 2012, 28 (15) :2080-2081
[9]   ELM 2016-data update and new functionality of the eukaryotic linear motif resource [J].
Dinkel, Holger ;
Van Roey, Kim ;
Michael, Sushama ;
Kumar, Manjeet ;
Uyar, Bora ;
Altenberg, Brigitte ;
Milchevskaya, Vladislava ;
Schneider, Melanie ;
Kuehn, Helen ;
Behrendt, Annika ;
Dahl, Sophie Luise ;
Damerell, Victoria ;
Diebel, Sandra ;
Kalman, Sara ;
Klein, Steffen ;
Knudsen, Arne C. ;
Maeder, Christina ;
Merrill, Sabina ;
Staudt, Angelina ;
Thiel, Vera ;
Welti, Lukas ;
Davey, Norman E. ;
Diella, Francesca ;
Gibson, Toby J. .
NUCLEIC ACIDS RESEARCH, 2016, 44 (D1) :D294-D300
[10]   ELM-the database of eukaryotic linear motifs [J].
Dinkel, Holger ;
Michael, Sushama ;
Weatheritt, Robert J. ;
Davey, Norman E. ;
Van Roey, Kim ;
Altenberg, Brigitte ;
Toedt, Grischa ;
Uyar, Bora ;
Seiler, Markus ;
Budd, Aidan ;
Joedicke, Lisa ;
Dammert, Marcel A. ;
Schroeter, Christian ;
Hammer, Maria ;
Schmidt, Tobias ;
Jehl, Peter ;
McGuigan, Caroline ;
Dymecka, Magdalena ;
Chica, Claudia ;
Luck, Katja ;
Via, Allegra ;
Chatr-Aryamontri, Andrew ;
Haslam, Niall ;
Grebnev, Gleb ;
Edwards, Richard J. ;
Steinmetz, Michel O. ;
Meiselbach, Heike ;
Diella, Francesca ;
Gibson, Toby J. .
NUCLEIC ACIDS RESEARCH, 2012, 40 (D1) :D242-D251