iSulfoTyr-PseAAC: Identify Tyrosine Sulfation Sites by Incorporating Statistical Moments via Chou's 5-steps Rule and Pseudo Components

Cited by: 28
Authors
Barukab, Omar [1]
Khan, Yaser Daanial [2]
Khan, Sher Afzal [1, 4]
Chou, Kuo-Chen [3]
Affiliations
[1] King Abdulaziz Univ, Fac Comp & Informat Technol Rabigh, Dept Informat Technol, POB 344, Rabigh 21911, Saudi Arabia
[2] Univ Management & Technol, Sch Syst & Technol, Dept Comp Sci, POB 10033,C-2, Lahore 54770, Pakistan
[3] Gordon Life Sci Inst, Boston, MA 02478 USA
[4] Abdul Wali Khan Univ, Dept Comp Sci, Mardan, Pakistan
Keywords
Sulfation; sulfotyrosine; statistical moments; PseAAC; 5-step rule; pseudo components; AMINO-ACID-COMPOSITION; MULTI-LABEL CLASSIFIER; PREDICT SUBCELLULAR-LOCALIZATION; LYSINE SUCCINYLATION SITES; SEQUENCE-BASED PREDICTOR; CRITICAL SPHERICAL-SHELL; S-NITROSYLATION SITES; FLEXIBLE WEB SERVER; ENSEMBLE CLASSIFIER; RECOMBINATION SPOTS;
DOI
10.2174/1389202920666190819091609
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
Background: Amino acid residues in proteins undergo post-translational modification (PTM) during protein synthesis, a process of chemical and physical change in an amino acid that in turn alters the behavioral properties of proteins. Tyrosine sulfation is a ubiquitous post-translational modification known to be associated with the regulation of various biological functions and pathological processes, so identifying it is necessary to understand its mechanism. Experimental determination through site-directed mutagenesis and high-throughput mass spectrometry is costly and time-consuming; a reliable computational model is therefore required for the identification of sulfotyrosine sites. Methodology: In this paper, we present a computational model for the prediction of sulfotyrosine sites, named iSulfoTyr-PseAAC, in which feature vectors are constructed using statistical moments of protein amino acid sequences and various position/composition relative features. These features are incorporated into PseAAC. The model is validated by jackknife, cross-validation, self-consistency and independent testing. Results: Accuracy determined through validation was 93.93% for the jackknife test, 95.16% for cross-validation, 94.3% for self-consistency and 94.3% for independent testing. Conclusion: The proposed model performs better than the existing predictors; however, the accuracy can be improved further in future as the number of known sulfotyrosine sites in proteins increases.
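The abstract describes feature vectors built from statistical moments of amino acid sequences. The following is a minimal sketch of that general idea only, not the authors' feature set: the 1..20 integer encoding, the one-dimensional position-weighted moment definitions, and the function names (`encode`, `raw_moment`, `central_moment`, `moment_features`) are all assumptions made for illustration.

```python
# Illustrative sketch only. The residue encoding and the 1-D moment
# definitions below are assumptions, not the published iSulfoTyr-PseAAC features.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
CODE = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}  # hypothetical 1..20 encoding


def encode(peptide):
    """Map each residue of a peptide window to its integer code."""
    return [CODE[aa] for aa in peptide]


def raw_moment(values, order):
    """Raw moment of the given order: sum of i**order * v_i over positions i = 1..n."""
    return sum((i ** order) * v for i, v in enumerate(values, start=1))


def central_moment(values, order):
    """Central moment: like the raw moment, but positions are taken relative to the centroid."""
    centroid = raw_moment(values, 1) / raw_moment(values, 0)
    return sum(((i - centroid) ** order) * v for i, v in enumerate(values, start=1))


def moment_features(peptide, max_order=3):
    """Concatenate raw moments (orders 0..max_order) and central moments (orders 2..max_order)."""
    values = encode(peptide)
    raw = [raw_moment(values, k) for k in range(max_order + 1)]
    central = [central_moment(values, k) for k in range(2, max_order + 1)]
    return raw + central
```

Calling `moment_features` on a fixed-length window centred on a candidate tyrosine would give one fixed-size numeric vector per site, which is the kind of input a classifier validated by jackknife or cross-validation expects.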
Pages: 306-320
Number of pages: 15