Identifying essential proteins based on protein domains in protein-protein interaction networks

Times cited: 0
Authors
Wang, Jianxin [1]
Peng, Wei [1,2]
Chen, Yingjiao [1]
Lu, Yu [1]
Pan, Yi
Affiliations
[1] Cent S Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China
[2] Kunming Univ Sci & Technol, Fac Informat Engn & Automat, Kunming 6500939, Yunnan, Peoples R China
Source
2013 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM) | 2013
Funding
US National Science Foundation; National Natural Science Foundation of China;
Keywords
Essential proteins; protein domain; protein-protein interaction networks; PREDICTING ESSENTIAL PROTEINS; ESSENTIAL GENES; CENTRALITY; IDENTIFICATION; INTEGRATION; DATABASE; PFAM;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Predicting essential proteins, which are crucial to an organism's survival, is important for disease analysis and drug design as well as for understanding cellular life. Most prediction methods infer the likelihood that a protein is essential from network topology. However, these methods are limited by the completeness of the available protein-protein interaction (PPI) data and depend on the accuracy of the network. To overcome these limitations, some computational methods have been proposed, but few of them address the problem by taking protein domains into account. In this work, we first analyze the correlation between the essentiality of proteins and their domain features based on data from 13 species. We find that proteins containing more protein domain types that rarely occur in other proteins tend to be essential. Accordingly, we propose a new prediction method, named UDoNC, which combines the domain features of proteins with their topological properties in the PPI network. In UDoNC, the essentiality of a protein is determined by the number and frequency of its protein domain types, as well as by the essentiality of its adjacent edges as measured by the edge clustering coefficient. Experimental results on S. cerevisiae data show that UDoNC outperforms other existing methods in terms of area under the curve (AUC).
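The abstract names the ingredients of the score (the number and frequency of a protein's domain types, and the edge clustering coefficient of its adjacent edges) but does not give the formulas. The following is a minimal sketch of how such a score could be assembled, not the published UDoNC definition. The edge clustering coefficient uses the commonly cited form (shared neighbours divided by the smaller of the two degrees minus one). The domain weighting (each distinct domain type contributes 1/frequency) and the way the two parts are multiplied and summed are illustrative assumptions, and all function names are hypothetical.

```python
from collections import Counter


def edge_clustering_coefficient(adj, u, v):
    """Shared neighbours of u and v divided by min(deg(u) - 1, deg(v) - 1)."""
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if denom <= 0:
        return 0.0  # an edge to a degree-1 node cannot lie in a triangle
    return len(adj[u] & adj[v]) / denom


def domain_score(domains, freq):
    """ASSUMPTION: each distinct domain type contributes 1 / (number of
    proteins containing it), so many rare domain types give a high score."""
    return sum(1.0 / freq[d] for d in set(domains))


def score_proteins(edges, protein_domains):
    """ASSUMPTION: a protein's score is the sum, over its adjacent edges,
    of the edge clustering coefficient times the domain scores of both
    endpoints. This combination is illustrative only."""
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Count how many proteins contain each domain type.
    freq = Counter(d for doms in protein_domains.values() for d in set(doms))
    dscore = {p: domain_score(protein_domains.get(p, ()), freq) for p in adj}

    return {
        u: sum(
            edge_clustering_coefficient(adj, u, v) * (dscore[u] + dscore[v])
            for v in adj[u]
        )
        for u in adj
    }


# Toy network: a triangle A-B-C with a pendant node D attached to C.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
toy_domains = {"A": ["PF1", "PF2"], "B": ["PF1"], "C": ["PF1"], "D": ["PF3"]}
toy_scores = score_proteins(toy_edges, toy_domains)
ranking = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

In this toy network, protein A ranks first because it sits in a triangle and carries a domain type found in no other protein, while D scores zero because its only edge belongs to no triangle. The domain identifiers and the network are made up for illustration.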
Pages: 6