Predicting reliable regions in protein sequence alignments

Cited by: 59
Authors
Cline, M [1]
Hughey, R
Karplus, K
Affiliations
[1] Univ Calif Santa Cruz, Jack Baskin Sch Engn, Ctr Biomol Sci & Engn, Santa Cruz, CA 95064 USA
[2] Univ Calif Santa Cruz, Jack Baskin Sch Engn, Dept Comp Engn, Santa Cruz, CA 95064 USA
Funding
US National Science Foundation
DOI
10.1093/bioinformatics/18.2.306
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Protein sequence alignments have a myriad of applications in bioinformatics, including secondary and tertiary structure prediction, homology modeling, and phylogeny. Unfortunately, all alignment methods make mistakes, and mistakes in alignments often yield mistakes in their application. Thus, a method to identify and remove suspect alignment positions could benefit many areas in protein sequence analysis.
Results: We tested four predictors of alignment position reliability, including near-optimal alignment information, column score, and secondary structural information. We validated each predictor against a large library of alignments, removing positions predicted as unreliable. Near-optimal alignment information was the best predictor, removing 70% of the substantially-misaligned positions and 58% of the over-aligned positions, while retaining 86% of those aligned accurately.
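The percentages reported in the abstract come from filtering alignment positions with a reliability predictor and then counting, per category, how many positions were removed. The following is a minimal sketch of that bookkeeping only, not the paper's predictor: the scores, the category labels, the threshold, and the function name `removal_fractions` are all hypothetical and chosen purely for illustration.

```python
from collections import Counter


def removal_fractions(positions, threshold):
    """Return, per category, the fraction of positions removed.

    positions: iterable of (score, category) pairs, where score is a
        hypothetical reliability score and category is the true status
        of the aligned position (assumed known from a reference alignment).
    threshold: positions scoring below this value are removed.
    """
    total = Counter()
    removed = Counter()
    for score, category in positions:
        total[category] += 1
        if score < threshold:
            removed[category] += 1
    return {category: removed[category] / total[category] for category in total}


# Toy data (made up): reliability score and true status of each position.
positions = [
    (0.9, "accurate"), (0.8, "accurate"), (0.7, "accurate"), (0.3, "accurate"),
    (0.2, "misaligned"), (0.6, "misaligned"),
    (0.1, "over-aligned"), (0.4, "over-aligned"),
]

report = removal_fractions(positions, threshold=0.5)
retained_accurate = 1.0 - report["accurate"]
```

In this toy run one of the four accurate positions falls below the threshold, so three quarters of the accurate positions are retained; the abstract's 70%, 58%, and 86% figures are the same kind of ratio measured on the authors' alignment library.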
Pages: 306-314
Page count: 9