Extracting Coevolutionary Features from Protein Sequences for Predicting Protein-Protein Interactions

Cited by: 36
Authors
Hu, Lun [1 ,2 ]
Chan, Keith C. C. [2 ]
Affiliations
[1] Wuhan Univ Technol, Sch Comp Sci & Technol, Wuhan 430070, Hubei, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China
Keywords
Coevolutionary information; covariations; protein-protein interaction prediction; sequence information; IDENTIFICATION; PURIFICATION; INFORMATION; DATABASE; SUPPORT;
DOI
10.1109/TCBB.2016.2520923
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification codes
071010 ; 081704 ;
Abstract
Knowing how proteins interact with each other is crucial to our understanding of the functional mechanisms of proteins. For this reason, different approaches have been developed to predict protein-protein interactions (PPIs) computationally. Among them, sequence-based approaches are preferred because they require no information about protein properties to perform their tasks. Instead, most sequence-based approaches use feature extraction methods to extract features directly from protein sequences, so that a feature vector can be constructed for each protein sequence. The feature vectors of every pair of proteins are then concatenated to form two classes of interacting and non-interacting proteins. Predicting whether two proteins interact is then formulated as a classification problem. How accurately PPIs can be predicted therefore depends on how well the features extracted from the protein sequences distinguish interacting from non-interacting pairs. To this end, instead of extracting such features from each protein sequence independently of the other protein in the same pair, we propose to consider features from both sequences in a protein pair jointly during feature extraction, using a novel coevolutionary feature extraction approach called CoFex. Coevolutionary features extracted by CoFex refer to the covariations found at coevolving positions. Based on the presence and absence of these coevolutionary features in the sequences of two proteins, feature vectors can be composed for pairs of proteins rather than for individual proteins. The experimental results show that CoFex is a promising feature extraction approach and can improve the performance of PPI prediction.
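The conventional sequence-based setup that the abstract contrasts with CoFex (one feature vector per protein, concatenated per pair, then classified) can be sketched roughly as follows. This is only an illustration under stated assumptions: amino acid composition stands in for the per-protein features, the names `composition` and `pair_vector` are hypothetical, and the sequences are toy strings. It does not implement CoFex or its coevolutionary features.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Per-protein feature vector: fraction of each standard amino acid.

    Amino acid composition is an assumed stand-in feature, chosen for brevity.
    """
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[a] / total for a in AMINO_ACIDS]


def pair_vector(seq_a, seq_b):
    """Concatenate the two per-protein vectors into one vector for the pair."""
    return composition(seq_a) + composition(seq_b)


# Toy labelled pairs (1 = interacting, 0 = non-interacting); not real data.
pairs = [("MKTAYIAK", "GAVLIMKT", 1), ("MKKK", "WWYF", 0)]
X = [pair_vector(a, b) for a, b, _ in pairs]
y = [label for _, _, label in pairs]
```

`X` and `y` could then be passed to any binary classifier. Note that each protein here is described independently of its partner, which is the limitation the abstract says CoFex addresses by composing features at the level of the pair.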
Pages: 155-166
Page count: 12