Protein fold recognition using residue-based alignments of sequence and secondary structure

Cited by: 0
Authors
Aydin, Zafer [1]
Erdogan, Hakan [2]
Altunbasak, Yucel [1]
Affiliations
[1] Georgia Inst Technol, Ctr Signal & Image Proc, Centergy 5th Floor, Atlanta, GA 30332 USA
[2] Sabanci Univ, Fac Engn & Nat Sci, Istanbul, Turkey
Source
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PTS 1-3, PROCEEDINGS | 2007
Keywords
protein fold recognition; secondary structure alignment; amino acid alignment; score normalization; gap cost;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Protein structure prediction aims to determine the three-dimensional structure of proteins from their amino acid sequences. When a protein does not have similarity (homology) to any known fold, threading or fold recognition methods are used to predict structure. Fold recognition methods frequently employ secondary structure, solvent accessibility, and evolutionary information to enhance the accuracy and the quality of the predictions. In this paper, we present a residue-based alignment method as an alternative to the state-of-the-art SSEA method, originally introduced by Przytycka et al. [1] and further modified by McGuffin et al. [2]. We introduce a residue-based score function, which can incorporate amino acid similarity matrices such as BLOSUM into secondary structure similarity scoring and compute joint alignments. We show that the power of the SSEA method comes from the length normalization rather than from the element alignment technique, and that similar performance can be achieved using residue-based alignments of secondary structures by optimizing gap costs. In simulations with two benchmark datasets, our method performs slightly better than SSEA in terms of fold recognition accuracy. When the secondary structure similarity matrix is combined with the amino acid based BLOSUM30 matrix, the accuracy of our method improves further (4% for the McGuffin set and 10% for the Ding and Dubchak set). The ability to align the amino acid and secondary structure sequences jointly offers a better starting point for more elaborate techniques that employ profile-profile alignments and machine learning methods [3,4].
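The idea of a joint, length-normalized residue alignment can be illustrated with a minimal sketch. The abstract does not give the paper's score function, gap model, or normalization, so everything below is an assumption made for illustration: the toy `ss_score` and `aa_score` tables (the latter standing in for a real matrix such as BLOSUM30), the mixing weight `w`, the linear gap cost `gap`, and normalization by the mean sequence length. The alignment itself is a standard Needleman-Wunsch global alignment.

```python
from typing import Callable, Sequence, Tuple

Residue = Tuple[str, str]  # (amino acid, secondary structure state), e.g. ("A", "H")

def ss_score(s: str, t: str) -> float:
    # Hypothetical secondary structure similarity: reward identical states.
    return 2.0 if s == t else -1.0

def aa_score(x: str, y: str) -> float:
    # Hypothetical stand-in for an amino acid matrix such as BLOSUM30.
    return 1.0 if x == y else -1.0

def joint_score(p: Residue, q: Residue, w: float = 0.5) -> float:
    # Assumed combination: weighted sum of the two residue-level scores.
    return w * ss_score(p[1], q[1]) + (1.0 - w) * aa_score(p[0], q[0])

def global_alignment_score(a: Sequence[Residue], b: Sequence[Residue],
                           score: Callable[[Residue, Residue], float],
                           gap: float) -> float:
    # Needleman-Wunsch global alignment score with a linear gap cost,
    # keeping only two rows of the dynamic programming table.
    m = len(b)
    prev = [-gap * j for j in range(m + 1)]
    for i in range(1, len(a) + 1):
        cur = [-gap * i] + [0.0] * m
        for j in range(1, m + 1):
            cur[j] = max(prev[j - 1] + score(a[i - 1], b[j - 1]),  # align residues
                         prev[j] - gap,                            # gap in b
                         cur[j - 1] - gap)                         # gap in a
        prev = cur
    return prev[m]

def normalized_alignment_score(a: Sequence[Residue], b: Sequence[Residue],
                               gap: float = 1.0, w: float = 0.5) -> float:
    # Assumed length normalization: divide the raw score by the mean length.
    raw = global_alignment_score(a, b, lambda p, q: joint_score(p, q, w), gap)
    mean_len = (len(a) + len(b)) / 2.0
    return raw / mean_len if mean_len else 0.0

query = list(zip("ACDE", "HHEE"))
template = list(zip("ACDF", "HHEC"))
print(normalized_alignment_score(query, query))
print(normalized_alignment_score(query, template))
```

In a fold recognition setting, a score of this kind would be computed between a query and every template in a fold library, and the best-scoring template would be reported; the gap cost and weight are the parameters one would tune on a benchmark.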
Pages: 349 / +
Page count: 2
Related papers
11 in total
[1]  
Altschul SF, 1996, METHOD ENZYMOL, V266, P460
[2]   MANIFOLD: protein fold recognition based on secondary structure, sequence similarity and enzyme classification [J].
Bindewald, E ;
Cestaro, A ;
Hesser, J ;
Heiler, M ;
Tosatto, SCE .
PROTEIN ENGINEERING, 2003, 16 (11) :785-789
[3]   Multi-class protein fold recognition using support vector machines and neural networks [J].
Ding, CHQ ;
Dubchak, I .
BIOINFORMATICS, 2001, 17 (04) :349-358
[4]  
DURBIN R, 1981, BIOL SEQUENCE ANAL P
[5]  
GEWEHR JE, 2004, GERM C BIOINF GCB, P141
[6]   Targeting novel folds for structural genomics [J].
McGuffin, LJ ;
Jones, DT .
PROTEINS-STRUCTURE FUNCTION AND BIOINFORMATICS, 2002, 48 (01) :44-52
[7]   What are the baselines for protein fold recognition? [J].
McGuffin, LJ ;
Bryson, K ;
Jones, DT .
BIOINFORMATICS, 2001, 17 (01) :63-72
[8]   CATH - a hierarchic classification of protein domain structures [J].
Orengo, CA ;
Michie, AD ;
Jones, S ;
Jones, DT ;
Swindells, MB ;
Thornton, JM .
STRUCTURE, 1997, 5 (08) :1093-1108
[9]  
Przytycka T, 1999, NAT STRUCT BIOL, V6, P672
[10]   Iterative sequence/secondary structure search for protein homologs: comparison with amino acid sequence alignments and application to fold recognition in genome databases [J].
Wallqvist, A ;
Fukunishi, Y ;
Murphy, LR ;
Fadel, A ;
Levy, RM .
BIOINFORMATICS, 2000, 16 (11) :988-1002