Finding finer functions for partially characterized proteins by protein-protein interaction networks

Cited by: 0
Authors
LI YanHui (1)
Affiliations
(2) Department of Bioinformatics
Funding
National Natural Science Foundation of China;
Keywords
protein-protein interaction; Gene Ontology; gene function; algorithm; prediction;
DOI
Not available
Chinese Library Classification (CLC) number
Q51 [Proteins];
Discipline classification code
071010 ; 081704 ;
Abstract
Based on high-throughput data, numerous algorithms have been designed to find the functions of novel proteins. However, the effectiveness of such algorithms is currently limited by several fundamental factors: (1) the low a priori probability that a novel protein participates in a detailed function; (2) the large amount of false data in high-throughput datasets; (3) the incomplete data coverage of functional classes; (4) the abundant but heterogeneous negative samples available for training the algorithms; and (5) the lack of detailed functional knowledge for training them. Here, for partially characterized proteins, we propose an approach to finding their finer functions based on protein interaction sub-networks or gene expression patterns defined in function-specific subspaces. By properly defining the prediction range and functionally filtering the noisy data, the proposed approach lessens the above problems and can thus efficiently find novel functions of proteins. For thousands of partially characterized yeast and human proteins, it reliably finds finer functions (e.g., translational functions) with more than 90% precision. The predicted finer functions are valuable both for guiding follow-up wet-lab validation and for providing the data needed to train algorithms that learn the functions of other proteins.
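The abstract does not spell out the prediction rule, so the following is only a minimal sketch of the general idea, assuming a simple neighbor-voting scheme. The interaction network is first restricted to proteins that share a known coarse (parent) function, and a finer function is then assigned from the annotated neighbors inside that sub-network. The function name `predict_finer_function`, the thresholds `min_votes` and `min_fraction`, and the protein identifiers `P1` to `P6` are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def predict_finer_function(protein, parent_members, interactions,
                           finer_annotations, min_votes=2, min_fraction=0.5):
    """Assumed neighbor-voting rule inside a function-specific sub-network.

    protein           -- identifier of the partially characterized protein
    parent_members    -- set of proteins annotated to the coarse (parent) function
    interactions      -- iterable of (protein_a, protein_b) interaction pairs
    finer_annotations -- dict: protein -> set of known finer functions
    Returns the winning finer function, or None if support is too weak.
    """
    # Collect interaction partners of the query protein (ignore self-loops).
    neighbors = {
        b if a == protein else a
        for a, b in interactions
        if protein in (a, b) and a != b
    }
    # Functional filtering: keep only partners inside the parent-function subspace.
    neighbors &= parent_members

    votes = Counter()
    annotated = 0
    for neighbor in neighbors:
        functions = finer_annotations.get(neighbor, set())
        if functions:
            annotated += 1
            votes.update(functions)
    if not votes:
        return None

    function, count = votes.most_common(1)[0]
    # Require both an absolute and a relative level of support (assumed thresholds).
    if count >= min_votes and count / annotated >= min_fraction:
        return function
    return None


# Toy data, invented for illustration only.
interactions = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P1", "P5"), ("P6", "P2")]
parent_members = {"P1", "P2", "P3", "P4", "P6"}  # P5 lies outside the subspace
finer_annotations = {
    "P2": {"translational initiation"},
    "P3": {"translational initiation"},
    "P4": {"translational elongation"},
    "P5": {"translational elongation"},
}

print(predict_finer_function("P1", parent_members, interactions, finer_annotations))
print(predict_finer_function("P6", parent_members, interactions, finer_annotations))
```

In this toy network the interaction between `P1` and `P5` is discarded because `P5` is not annotated to the parent function, which is one way the "functional filtering" of noisy interactions could be realized. `P6` has only one supporting neighbor, so no prediction is made for it.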
Pages: 3363-3370
Page count: 8