Quantitative assessment of relationship between sequence similarity and function similarity

Cited by: 66
Authors
Joshi, Trupti
Xu, Dong [1]
Affiliations
[1] Univ Missouri, Digital Biol Lab, Dept Comp Sci, Columbia, MO 65211 USA
[2] Univ Missouri, Christopher S Bond Life Sci Ctr, Columbia, MO 65211 USA
DOI
10.1186/1471-2164-8-222
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification code
071005; 0836; 090102; 100705
Abstract
Background: Comparative sequence analysis is considered the first step towards annotating new proteins in genome annotation. However, sequence comparison may lead to the creation and propagation of function assignment errors. Thus, it is important to perform a thorough, systematic analysis of the quality of sequence-based function assignment using large-scale data. Results: We present an analysis of the relationship between sequence similarity and function similarity for the proteins in four model organisms, i.e., Arabidopsis thaliana, Saccharomyces cerevisiae, Caenorhabditis elegans, and Drosophila melanogaster. Using a measure of functional similarity based on the three categories of Gene Ontology (GO) classifications (biological process, molecular function, and cellular component), we quantified the correlation between functional similarity and sequence similarity, measured by sequence identity or by the statistical significance of the alignment, and compared this correlation against that of randomly chosen protein pairs. Conclusion: Various sequence-function relationships were identified from BLAST versus PSI-BLAST, sequence identity versus Expectation Value, GO indices versus semantic similarity approaches, and within-genome versus between-genome comparisons, for the three GO categories. Our study provides a benchmark for estimating the confidence of function assignments based purely on sequence similarity.
Pages: 10