Prediction of zinc binding sites in proteins using sequence derived information

Cited by: 18
Authors
Srivastava, Abhishikha [1]
Kumar, Manish [1]
Affiliations
[1] Univ Delhi South Campus, Dept Biophys, Benito Juarez Rd, New Delhi 110021, India
Keywords
zinc metal binding site; machine learning; support vector machine; PSSM; fivefold cross-validation; SUPPORT VECTOR MACHINE; SECONDARY STRUCTURE PREDICTION; FUNCTIONAL DOMAIN COMPOSITION; WEB SERVER; STRUCTURAL CLASS; BETA-LACTAMASE; SVM; CLASSIFICATION; RECOGNITION; ALGORITHM;
DOI
10.1080/07391102.2017.1417910
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Zinc is one of the most abundant catalytic cofactors and also an important structural component of a large number of metalloproteins. Hence, prediction of zinc-binding sites in proteins can be a significant step in annotating the molecular function of a large number of proteins. The majority of existing methods for zinc-binding site prediction are based on a data-set of proteins that was compiled nearly a decade ago. Hence, there is a need to develop a zinc-binding site prediction system using current, updated data that include recently added proteins. Herein, we propose a support vector machine-based method, named ZincBinder, for prediction of zinc-binding sites in a protein using sequence profile information. The predictor was trained using a fivefold cross-validation approach and achieved 85.37% sensitivity with 86.20% specificity during training. Benchmarking on an independent non-redundant data-set, which was not used during training, showed better performance of ZincBinder vis-a-vis existing methods. Executable versions, source code, sample datasets, and usage instructions are available at
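The abstract reports performance as sensitivity and specificity. As a reading aid, the minimal sketch below shows how these two measures are conventionally computed from per-residue binary labels (1 = zinc-binding, 0 = non-binding). The function names and the toy labels are illustrative assumptions; they are not taken from ZincBinder or its data-set.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, where 1 is the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true binding residues that were predicted as binding."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn: int, fp: int) -> float:
    """Fraction of true non-binding residues that were predicted as non-binding."""
    return tn / (tn + fp) if (tn + fp) else 0.0


# Toy labels (hypothetical, for illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(f"sensitivity = {sensitivity(tp, fn):.2%}")
print(f"specificity = {specificity(tn, fp):.2%}")
```

In a fivefold cross-validation setting, these measures would typically be computed on each held-out fold and then averaged; how the authors aggregated them is not stated in this record.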
Pages: 4413-4423
Number of pages: 11