Extracting Coevolutionary Features from Protein Sequences for Predicting Protein-Protein Interactions

Cited by: 36
Authors
Hu, Lun [1, 2]
Chan, Keith C. C. [2]
Affiliations
[1] Wuhan Univ Technol, Sch Comp Sci & Technol, Wuhan 430070, Hubei, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China
Keywords
Coevolutionary information; covariations; protein-protein interaction prediction; sequence information; IDENTIFICATION; PURIFICATION; INFORMATION; DATABASE; SUPPORT;
DOI
10.1109/TCBB.2016.2520923
Chinese Library Classification
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Knowing how proteins interact with each other is crucial to understanding their functional mechanisms. For this reason, different approaches have been developed to predict protein-protein interactions (PPIs) computationally. Among them, sequence-based approaches are preferred because they do not require any information about protein properties to perform their tasks. Instead, most sequence-based approaches use feature extraction methods to derive features directly from protein sequences, so that a feature vector can be constructed for each sequence. The feature vectors of the two proteins in each pair are then concatenated, and the resulting pair vectors form two classes: interacting and non-interacting proteins. Predicting whether two proteins interact is then formulated as a classification problem. The accuracy of PPI prediction therefore depends on how well the features extracted from the protein sequences distinguish interacting pairs from non-interacting pairs. Instead of extracting such features from each protein sequence independently of the other protein in the same pair, we propose to consider both sequences in a protein pair jointly during feature extraction, using a novel coevolutionary feature extraction approach called CoFex. The coevolutionary features extracted by CoFex are the covariations found at coevolving positions. Based on the presence or absence of these coevolutionary features in the sequences of two proteins, feature vectors can be composed for pairs of proteins rather than for individual proteins. The experimental results show that CoFex is a promising feature extraction approach and can improve the performance of PPI prediction.
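The contrast drawn in the abstract between concatenating per-protein feature vectors and composing one vector per protein pair can be illustrated with a minimal Python sketch. This is not the CoFex algorithm: amino acid composition stands in for a generic per-sequence feature extractor, and the `patterns` list of paired motifs is a hypothetical placeholder for coevolutionary features, which the abstract does not specify in detail. All function names are assumptions made for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Per-protein feature vector: fraction of each amino acid in seq.

    Stand-in for any independent, per-sequence feature extractor.
    """
    n = len(seq)
    return [seq.count(a) / n for a in AMINO_ACIDS]


def concatenated_features(seq_a, seq_b):
    """Conventional scheme: extract features from each protein
    independently, then concatenate the two vectors (length 40 here)."""
    return composition(seq_a) + composition(seq_b)


def pair_presence_features(seq_a, seq_b, patterns):
    """Pair-level scheme: one binary entry per pattern, set to 1 only
    when the first motif occurs in seq_a AND the second in seq_b.

    `patterns` is a hypothetical list of (motif_a, motif_b) tuples used
    as a placeholder for coevolutionary features.
    """
    return [int(ma in seq_a and mb in seq_b) for ma, mb in patterns]


if __name__ == "__main__":
    a, b = "MKVACAC", "GLLGG"
    print(len(concatenated_features(a, b)))
    print(pair_presence_features(a, b, [("MK", "LL"), ("MK", "AA")]))
```

Either kind of vector, labeled as interacting or non-interacting, could then be passed to a standard classifier; the pair-level vector differs in that each entry depends on both sequences at once.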
Pages: 155-166
Page count: 12