Structure based prediction of binding residues on DNA-binding proteins

Cited by: 13
Authors
Bhardwaj, Nitin [1]
Langlois, Robert E. [1]
Hui, Guijun Zhao [1]
Affiliations
[1] Univ Illinois, Dept Bioengn, Bioinformat Program, Chicago, IL 60607 USA
Source
2005 27TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY, VOLS 1-7 | 2005
Keywords
protein-DNA interaction; function annotation; SVMs; binding site prediction
DOI
10.1109/IEMBS.2005.1617004
Chinese Library Classification
R318 [Biomedical Engineering]
Discipline classification code
0831
Abstract
Annotation of the functional sites on the surface of a protein has been the subject of many studies. In this regard, the search for attributes and features characterizing these sites is of prime importance. Here, we present an implementation of a kernel-based machine learning protocol for identifying the residues on a DNA-binding protein that form the interface with the DNA. Sequence and structural features including solvent accessibility, local composition, net charge and electrostatic potentials are examined. These features are then fed into Support Vector Machines (SVMs) to predict the DNA-binding residues on the surface of the protein. To allow comparison with published work, we first predict binding residues by training on other binding and non-binding residues in the same protein, achieving an accuracy of 79%, with sensitivity and specificity of 59% and 89%, respectively. We also consider a more realistic approach, predicting the binding residues of proteins entirely withheld from the training set, which yields an accuracy, sensitivity and specificity of 66%, 43% and 81%, respectively. The performance reported here is better than other published results. Moreover, since our protocol does not rely on sequence or structural homology, it can be used to annotate unclassified proteins and, more generally, to identify novel binding sites with no similarity to known cases.
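For readers unfamiliar with the three figures quoted in the abstract, the following is a minimal sketch of how accuracy, sensitivity and specificity are conventionally computed from per-residue confusion counts. The function name `binary_metrics` and the counts in the usage lines are hypothetical and are not taken from the paper; the counts were chosen only so that the arithmetic lands on round values.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts.

    tp/fn count true binding residues predicted as binding/non-binding;
    tn/fp count true non-binding residues predicted as non-binding/binding.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total      # fraction of all residues classified correctly
    sensitivity = tp / (tp + fn)      # fraction of binding residues recovered
    specificity = tn / (tn + fp)      # fraction of non-binding residues rejected
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (NOT data from the paper).
acc, sens, spec = binary_metrics(tp=59, fp=22, tn=178, fn=41)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Because non-binding residues typically outnumber binding ones, accuracy alone can look favourable even when sensitivity is modest, which is presumably why the abstract reports all three values.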
Pages: 2611-2614
Number of pages: 4
Related papers
21 in total
[1]   Analysis and prediction of DNA-binding proteins and their binding residues based on composition, sequence and structural information [J].
Ahmad, S ;
Gromiha, MM ;
Sarai, A .
BIOINFORMATICS, 2004, 20 (04) :477-486
[2]   Automated structure-based prediction of functional sites in proteins: Applications to assessing the validity of inheriting protein function from homology in genome annotation and to protein docking [J].
Aloy, P ;
Querol, E ;
Aviles, FX ;
Sternberg, MJE .
JOURNAL OF MOLECULAR BIOLOGY, 2001, 311 (02) :395-408
[3]  
[Anonymous], 2005, LIBSVM LIB SUPPORT V
[4]  
BHARDWAJ N, UNPUB KERNEL BASED M
[5]   CHARMM - A PROGRAM FOR MACROMOLECULAR ENERGY, MINIMIZATION, AND DYNAMICS CALCULATIONS [J].
BROOKS, BR ;
BRUCCOLERI, RE ;
OLAFSON, BD ;
STATES, DJ ;
SWAMINATHAN, S ;
KARPLUS, M .
JOURNAL OF COMPUTATIONAL CHEMISTRY, 1983, 4 (02) :187-217
[6]   Knowledge-based analysis of microarray gene expression data by using support vector machines [J].
Brown, MPS ;
Grundy, WN ;
Lin, D ;
Cristianini, N ;
Sugnet, CW ;
Furey, TS ;
Ares, M ;
Haussler, D .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2000, 97 (01) :262-267
[7]  
Christianini N., 1999, INTRO SUPPORT VECTOR
[8]   Multi-class protein fold recognition using support vector machines and neural networks [J].
Ding, CHQ ;
Dubchak, I .
BIOINFORMATICS, 2001, 17 (04) :349-358
[9]   The relationship between protein structure and function: a comprehensive survey with application to the yeast genome [J].
Hegyi, H ;
Gerstein, M .
JOURNAL OF MOLECULAR BIOLOGY, 1999, 288 (01) :147-164
[10]  
HONIG B, 1989, PROG CLIN BIOL RES, V278, P65