An Efficient Encoding for Simplified Protein Structure Prediction Using Genetic Algorithms

被引:0
作者
Shatabda, Swakkhar [1 ]
Newton, M. A. Hakim [1 ]
Rashid, Mahmood A. [1 ]
Sattar, Abdul [1 ]
机构
[1] Griffith Univ, IIIS, Nathan, Qld 4111, Australia
来源
2013 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC) | 2013年
关键词
Genetic Algorithm; Encoding; Protein Structure Prediction; Simplified Models; HP Model; Lattice; FOLDING SIMULATIONS; LATTICE; SEARCH; MODELS;
D O I
暂无
中图分类号
TP301 [理论、方法];
学科分类号
081202 ;
摘要
Protein structure prediction is one of the most challenging problems in computational biology and remains unsolved for many decades. In a simplified version of the problem, the task is to find a self-avoiding walk with the minimum free energy assuming a discrete lattice and a given energy matrix. Genetic algorithms currently produce the state-of-the-art results for simplified protein structure prediction. However, performance of the genetic algorithms largely depends on the encodings they use in representing protein structures and the twin removal technique they use in eliminating duplicate solutions from the current population. In this paper, we present a new efficient encoding for protein structures. Our encoding is non-isomorphic in nature and results into efficient twin removal. This helps the search algorithm diversify and explore a larger area of the search space. In addition to this, we also propose an approximate matching scheme for removing near-similar solutions from the population. Our encoding algorithm is generic and applicable to any lattice type. On the standard benchmark proteins, our techniques significantly improve the state-of-the-art genetic algorithm for hydrophobic-polar (HP) energy model on face-centered-cubic (FCC) lattice.
引用
收藏
页码:1217 / 1224
页数:8
相关论文
共 42 条
[1]   PRINCIPLES THAT GOVERN FOLDING OF PROTEIN CHAINS [J].
ANFINSEN, CB .
SCIENCE, 1973, 181 (4096) :223-230
[2]  
[Anonymous], [No title captured]
[3]  
[Anonymous], 2000, P PAC S BIOC
[4]   Protein structure prediction and structural genomics [J].
Baker, D ;
Sali, A .
SCIENCE, 2001, 294 (5540) :93-96
[5]  
Berger B, 1998, P 2 ANN INT C COMP M, P30
[6]  
BLAZEWICZ J, 2004, COMPUTATIONAL METHOD, V10, P7
[7]  
Bornberg-Bauer E., 1997, P 1 ANN INT C COMPUT, P47, DOI DOI 10.1145/267521.267528
[8]  
Cebrian Manuel., 2008, AAAI 08, P241
[9]   Mathematics - Packing challenge mastered at last [J].
Cipra, B .
SCIENCE, 1998, 281 (5381) :1267-1267
[10]   On the complexity of protein folding [J].
Crescenzi, P ;
Goldman, D ;
Papadimitriou, C ;
Piccolboni, A ;
Yannakakis, M .
JOURNAL OF COMPUTATIONAL BIOLOGY, 1998, 5 (03) :423-465