Analysis of the effects of multiple sequence alignments in protein secondary structure prediction

Cited by: 0
Authors
Pappas, GJ
Subramaniam, S
Affiliations
[1] Univ Catolica Brasilia, Biotechnol & Genomic Sci Program, Brasilia, DF, Brazil
[2] Univ Calif San Diego, Dept Bioengn, La Jolla, CA 92093 USA
[3] Univ Calif San Diego, Dept Chem, La Jolla, CA 92093 USA
[4] Univ Calif San Diego, Dept Biochem, La Jolla, CA 92093 USA
Source
ADVANCES IN BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, PROCEEDINGS | 2005 / Vol. 3594
Keywords
EVOLUTIONARY INFORMATION; HOMOLOGOUS SEQUENCES; ACCURACY; DATABASE; CODE;
DOI
Not available
Chinese Library Classification (CLC) number
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Secondary structure prediction methods are widely used bioinformatics algorithms that provide initial insights into protein structure from sequence information. Significant efforts have been made in recent years to improve prediction accuracy, especially through the incorporation of information from multiple sequence alignments. This motivated the search for the factors contributing to this improvement. We show that in two of the highly ranked secondary structure prediction methods, DSC and PREDATOR, the use of multiple alignments consistently improves prediction accuracy compared to the use of single sequences. This is validated using different measures of accuracy, which also make it possible to identify that helical regions benefit the most from alignments, whereas beta-strands seem to have reached a plateau in terms of predictability. The origins of this improvement are also explored in terms of sequence specificity, secondary structure composition, and the extent of sequence similarity that provides the optimal performance. It is found that divergent sequences, in the identity range of 25-55%, provide the largest accuracy gain, and that above 65% identity there is almost no advantage in using multiple alignments.
Pages: 128-140
Page count: 13