Predicting protein-protein interactions from sequence using correlation coefficient and high-quality interaction dataset

Cited by: 71
Authors
Shi, Ming-Guang [1 ,2 ,3 ]
Xia, Jun-Feng [1 ,4 ]
Li, Xue-Ling [1 ]
Huang, De-Shuang [1 ]
Affiliations
[1] Chinese Acad Sci, Hefei Inst Intelligent Machines, Intelligent Comp Lab, Hefei 230031, Peoples R China
[2] Univ Sci & Technol China, Dept Automat, Hefei 230026, Peoples R China
[3] Hefei Univ Technol, Sch Elect Engn & Automat, Hefei 230009, Peoples R China
[4] Univ Sci & Technol China, Sch Life Sci, Hefei 230026, Peoples R China
Funding
National Science Foundation (USA); National High Technology Research and Development Program of China (863 Program);
Keywords
Protein-protein interactions; Correlation coefficient; Support vector machine; Protein sequence; Gold standard positives dataset; Gold standard negatives dataset; SACCHAROMYCES-CEREVISIAE; SEMANTIC SIMILARITY; INTERACTION MAP; INTERACTION NETWORK; COMPONENT ANALYSIS; GLOBULAR-PROTEINS; AMINO-ACIDS; YEAST; SCALE; HYDROPHOBICITIES;
DOI
10.1007/s00726-009-0295-y
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Identifying protein-protein interactions (PPIs) is critical for understanding the cellular function of proteins and the machinery of a proteome. PPI data derived from high-throughput technologies are often incomplete and noisy. It is therefore important to develop both computational methods and high-quality interaction datasets for predicting PPIs. A sequence-based method is proposed that combines correlation coefficient (CC) transformation with a support vector machine (SVM). The CC transformation not only takes into account the neighboring effect within a protein sequence but also describes the level of correlation between two protein sequences. A gold standard positives (interacting) dataset, MIPS Core, and a gold standard negatives (non-interacting) dataset, GO-NEG, of the yeast Saccharomyces cerevisiae were mined to evaluate the method objectively and to attenuate bias. The SVM model combined with CC transformation yielded the best performance, with an accuracy of 87.94% on the gold standard positives and gold standard negatives datasets. The MATLAB source code and the datasets are available on request from smgsmg@mail.ustc.edu.cn.
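As a rough illustration of the idea of encoding the neighboring effect with correlation coefficients, the sketch below maps each residue to a numeric property and computes the Pearson correlation between a sequence's profile and the same profile shifted by a lag. This is not the authors' MATLAB implementation and not the formula from the paper: the choice of the Kyte-Doolittle hydropathy scale, the function names (pearson, lagged_cc, pair_features), the max_lag parameter, and the concatenation of the two proteins' vectors are all assumptions made for illustration. Only the within-sequence part is sketched, and the resulting fixed-length vector is the kind of input an SVM could be trained on.

```python
from math import sqrt

# Kyte-Doolittle hydropathy values, used here only as an example property
# (assumption: the paper may use different physicochemical scales).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def lagged_cc(seq, max_lag=3):
    """Correlation between a sequence's property profile and itself shifted by 1..max_lag."""
    profile = [KD[aa] for aa in seq if aa in KD]  # skip unknown residues
    if len(profile) < max_lag + 2:
        raise ValueError("sequence too short for the requested max_lag")
    return [pearson(profile[:-d], profile[d:]) for d in range(1, max_lag + 1)]


def pair_features(seq_a, seq_b, max_lag=3):
    """Hypothetical pair encoding: concatenate the two proteins' lagged-CC vectors."""
    return lagged_cc(seq_a, max_lag) + lagged_cc(seq_b, max_lag)


# Toy sequences, not taken from MIPS Core or GO-NEG.
features = pair_features("MKTAYIAKQR", "GSHMLVVAIE", max_lag=3)
print(len(features))
```

Because every protein is reduced to max_lag values regardless of its length, pairs of proteins with different lengths yield feature vectors of the same size, which is what a kernel classifier such as an SVM requires.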
Pages: 891-899
Number of pages: 9
Related papers
50 in total
  • [1] Predicting protein–protein interactions from sequence using correlation coefficient and high-quality interaction dataset
    Ming-Guang Shi
    Jun-Feng Xia
    Xue-Ling Li
    De-Shuang Huang
    Amino Acids, 2010, 38 : 891 - 899
  • [2] Predicting protein-protein interactions using high-quality non-interacting pairs
    Long Zhang
    Guoxian Yu
    Maozu Guo
    Jun Wang
    BMC Bioinformatics, 19
  • [3] Predicting protein-protein interactions using high-quality non-interacting pairs
    Zhang, Long
    Yu, Guoxian
    Guo, Maozu
    Wang, Jun
    BMC BIOINFORMATICS, 2018, 19
  • [4] Predicting Protein-Protein Interactions Using Correlation Coefficient and Principle Component Analysis
    Thanathamathee, Putthiporn
    Lursinsap, Chidchanok
    2009 3RD INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICAL ENGINEERING, VOLS 1-11, 2009, : 732 - +
  • [5] Genome-wide inference of protein interaction sites: lessons from the yeast high-quality negative protein-protein interaction dataset
    Guo, Jie
    Wu, Xiaomei
    Zhang, Da-Yong
    Lin, Kui
    NUCLEIC ACIDS RESEARCH, 2008, 36 (06) : 2002 - 2011
  • [6] Effect of the quality of the interaction data on predicting protein function from protein-protein interactions
    Qing-Shan Ni
    Zheng-Zhi Wang
    Gang-Guo Li
    Guang-Yun Wang
    Ying-Jie Zhao
    Interdisciplinary Sciences: Computational Life Sciences, 2009, 1 : 40 - 45
  • [7] Effect of the Quality of the Interaction Data on Predicting Protein Function from Protein-protein Interactions
    Ni, Qing-Shan
    Wang, Zheng-Zhi
    Li, Gang-Guo
    Wang, Guang-Yun
    Zhao, Ying-Jie
    INTERDISCIPLINARY SCIENCES-COMPUTATIONAL LIFE SCIENCES, 2009, 1 (01) : 40 - 45
  • [8] Sequence Representations and Their Utility for Predicting Protein-Protein Interactions
    Kimothi, Dhananjay
    Biyani, Pravesh
    Hogan, James M.
    Davis, Melissa J.
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2023, 20 (01) : 646 - 657
  • [9] Modeling Protein-Protein Interface Interactions as a Means for Predicting Protein-Protein Interaction Partners
    Reyes, Vicente M.
    JOURNAL OF BIOMOLECULAR STRUCTURE & DYNAMICS, 2009, 26 (06) : 873 - 873
  • [10] Predicting protein-protein interactions from protein sequences using meta predictor
    Xia, Jun-Feng
    Zhao, Xing-Ming
    Huang, De-Shuang
    AMINO ACIDS, 2010, 39 (05) : 1595 - 1599