Sequence-based prediction of DNA-binding sites on DNA-binding proteins

Cited by: 0
Authors
Gou, Z. [1]
Hwang, S. [1]
Kuznetsov, I. B. [1]
Affiliations
[1] SUNY Albany, Gen NY Sis Ctr Excellence Canc Genom, One Discovery Dr, Rensselaer, NY USA
Keywords
protein-DNA interaction; position specific scoring matrix; evolutionary conservation; web-server; DNA binding; prediction; pattern recognition; machine learning
DOI
Not available
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Identification of DNA-binding sites on DNA-binding proteins is important for functional annotation. Experimental determination of the structure of a protein-DNA complex is expensive. Reliable computational methods that use the sequence of a DNA-binding protein to predict its DNA-binding interface are needed. Results: We apply three machine learning methods (support vector machine, kernel logistic regression, and penalized logistic regression) to predict DNA-binding sites on a DNA-binding protein using its amino acid sequence as input. Prediction is performed using either a single sequence or a profile of evolutionary conservation. Our predictors perform better than other existing sequence-based methods. The outputs of all three individual methods are combined to obtain a consensus prediction. This further improves performance and results in an accuracy of 82.4%, a sensitivity of 84.9% and a specificity of 83.1% for the strict consensus prediction. Availability: http://lcg.rit.albany.edu/dp-bind
Pages: 268 / +
Page count: 2