A sequence-based computational method for prediction of MoRFs

Cited by: 6
Authors
Wang, Yu [1]
Guo, Yanzhi [1]
Pu, Xuemei [1]
Li, Menglong [1]
Affiliation
[1] Sichuan Univ, Coll Chem, Chengdu 610064, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
MOLECULAR RECOGNITION FEATURES; INTRINSICALLY DISORDERED PROTEINS; SECONDARY STRUCTURE; WEB SERVER; BINDING; REGIONS; KNN;
DOI
10.1039/c6ra27161h
Chinese Library Classification
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Molecular recognition features (MoRFs) are relatively short segments (10-70 residues) within intrinsically disordered regions (IDRs) that can undergo disorder-to-order transitions upon binding to partner proteins. Because MoRFs play key roles in important biological processes such as signaling and regulation, identifying them is crucial for a full understanding of the functional aspects of IDRs. However, given the relative sparseness of MoRFs in protein sequences, the accuracy of the available MoRF predictors is often inadequate for practical use, which leaves significant room for improvement. In this work, we developed a novel sequence-based predictor for MoRFs using a support vector machine (SVM) algorithm. First, we constructed a comprehensive dataset of annotated MoRFs spanning a wide range of lengths, from 10 to 70 residues. Our method first used the flanking regions to define the negative samples. Then, amino acid composition (AAC) and two previously unexplored features, namely composition, transition and distribution (CTD) and the K-nearest-neighbors (KNN) score, were used to characterize the sequence information of MoRFs. Finally, feature evaluation and optimization yielded an overall accuracy of 75.75% under five-fold cross-validation. On an independent test set of 110 proteins, the method also yielded a promising accuracy of 64.98%. Additionally, in external validation on the negative samples, our method still showed performance comparable to that of other existing methods. We believe that this study will be useful in elucidating the mechanism of MoRFs and in facilitating hypothesis-driven experimental design and validation.
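As an illustration of the simplest feature named in the abstract, amino acid composition (AAC) is conventionally the fraction of each of the 20 standard residues in a sequence segment. The sketch below shows that conventional definition only. The function name, the residue ordering, and the handling of non-standard characters are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter

# The 20 standard amino acids in alphabetical one-letter order
# (ordering is an assumption for this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(segment):
    """Return a 20-element list with the fraction of each standard residue.

    Characters outside the 20 standard residues are ignored (an assumption;
    the paper's treatment of such characters is not stated in the abstract).
    """
    counts = Counter(res for res in segment.upper() if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # No standard residues: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical segment, used only to show the call.
features = amino_acid_composition("MKKLLPTAAAG")
```

A vector like this, computed for each candidate segment, is the kind of fixed-length input an SVM classifier can consume. The CTD and KNN-score features would add further columns, but their exact definitions are not given in the abstract and are not sketched here.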
Pages: 18937-18945
Number of pages: 9