Characterization and identification of ubiquitin conjugation sites with E3 ligase recognition specificities

Cited by: 17
Authors
Nguyen, Van-Nui [1]
Huang, Kai-Yao [1]
Huang, Chien-Hsun [1,2]
Chang, Tzu-Hao [3]
Bretana, Neil Arvin [1]
Lai, K. Robert [1,4]
Weng, Julia Tzu-Ya [1,4]
Lee, Tzong-Yi [1,4]
Affiliations
[1] Yuan Ze Univ, Dept Comp Sci & Engn, Taoyuan 320, Taiwan
[2] Tao Yuan Hosp, Minist Hlth & Welf, Taoyuan 320, Taiwan
[3] Taipei Med Univ, Grad Inst Biomed Informat, Taipei 110, Taiwan
[4] Yuan Ze Univ, Innovat Ctr Big Data & Digital Convergence, Taoyuan 320, Taiwan
Source
BMC BIOINFORMATICS | 2015 / Vol. 16
Keywords
MAXIMAL DEPENDENCE DECOMPOSITION; PHOSPHORYLATION SITES; WEB SERVER; PREDICTION; DATABASE; DBPTM; UBIQUITYLATION; CLASSIFIER; PROTEINS
DOI
10.1186/1471-2105-16-S1-S1
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification code
071010; 081704
Abstract
Background: In eukaryotes, ubiquitin conjugation is an important mechanism underlying proteasome-mediated protein degradation and, as such, plays an essential role in regulating many cellular processes. In the ubiquitin-proteasome pathway, E3 ligases recognize a specific protein substrate and catalyze the attachment of ubiquitin to a lysine (K) residue. As more experimental data on ubiquitin conjugation sites become available, it becomes possible to develop prediction models that can be scaled to big data. However, no existing work has focused on the substrate site specificities of ubiquitination. Herein, we present an approach that uses an iterative statistical method to identify ubiquitin conjugation sites with substrate site specificities.
Results: In this investigation, a total of 6259 experimentally validated ubiquitinated proteins were obtained from dbPTM. After homologous fragments with 40% sequence identity were filtered out, the training data set contained 2658 ubiquitination sites (positive data) and 5532 non-ubiquitinated sites (negative data). Because the substrate site specificities of E3 ligases are difficult to characterize by conventional sequence logo analysis, a recursive statistical method was applied to obtain significantly conserved motifs. A profile hidden Markov model (profile HMM) was adopted to construct predictive models from the identified substrate motifs. Five-fold cross-validation was then used to evaluate the predictive model, which achieved a sensitivity, specificity, and accuracy of 73.07%, 65.46%, and 67.93%, respectively. Additionally, an independent testing set, completely blind to the training data of the predictive model, showed that the proposed method achieves a promising accuracy (76.13%) and outperforms other ubiquitination site prediction tools.
Conclusion: A case study demonstrated the effectiveness of the characterized substrate motifs for identifying ubiquitination sites. The proposed method offers a practical means of preliminary analysis and greatly reduces the number of potential targets that require further experimental confirmation. It may also help unravel the mechanisms and roles of these substrate motifs in E3 recognition and ubiquitin-mediated protein degradation.
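As a reading aid, the cross-validation figures in the abstract can be checked against each other: accuracy is the class-size-weighted average of sensitivity and specificity. The short sketch below assumes the reported rates were pooled over the full training set (2658 positive and 5532 negative sites); the abstract does not state how the five folds were aggregated, and the function name `accuracy_from_rates` is hypothetical.

```python
def accuracy_from_rates(sensitivity, specificity, n_pos, n_neg):
    """Recover overall accuracy from per-class rates and class sizes.

    Assumes the rates are pooled over all n_pos + n_neg sites
    (an assumption; the abstract does not say how folds were combined).
    """
    true_positives = sensitivity * n_pos   # correctly predicted ubiquitination sites
    true_negatives = specificity * n_neg   # correctly predicted non-ubiquitinated sites
    return (true_positives + true_negatives) / (n_pos + n_neg)


# Values taken from the abstract: Sn = 73.07%, Sp = 65.46%.
acc = accuracy_from_rates(0.7307, 0.6546, n_pos=2658, n_neg=5532)
print(f"{acc:.2%}")  # prints 67.93%
```

Under that assumption the result matches the reported accuracy of 67.93%, so the three figures are mutually consistent. Because negatives outnumber positives roughly two to one, the accuracy sits closer to the specificity than to the sensitivity.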
Pages: 11