A bi-objective function optimization approach for multiple sequence alignment using genetic algorithm

Cited by: 12
Authors
Chowdhury, Biswanath [1 ]
Garai, Gautam [2 ]
Affiliations
[1] Univ Calcutta, Dept Biophys Mol Biol & Bioinformat, Kolkata, India
[2] Saha Inst Nucl Phys, Comp Sect, Kolkata, India
Keywords
Multiple sequence alignment; Genetic algorithm; Integer coding; Selection; Wilcoxon sign test; Experimental comparison; Bi-objective function; HIDDEN MARKOV-MODELS; PROTEIN; ACCURACY; IMPROVEMENT; BENCHMARK; COLONY; MAFFT
DOI
10.1007/s00500-020-04917-5
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Multiple sequence alignment (MSA) is a computationally very complex problem, so it cannot be solved by exhaustive methods. MSA is now commonly solved by optimizing more than one objective simultaneously. In this paper, we propose a new genetic-algorithm-based alignment technique named bi-objective sequence alignment using genetic algorithm (BSAGA). The novelty of this approach lies in its selection process: one part of the population is selected based on the Sum of Pairs score, and the rest is selected based on Total Conserved Columns. We apply integer-based chromosomal coding that represents only the gap positions in an alignment. This representation improves the search's ability to reach an optimum, even for longer sequences. We tested BSAGA on BAliBASE and SABmark and compared its alignment scores with those of other relevant alignment techniques. BSAGA performs better than the others, a result further supported by the Wilcoxon sign test.
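The two ideas in the abstract, gap-position coding and a selection step split between two objectives, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring scheme, the split ratio, or the exact chromosome layout, so the match/mismatch/gap values, the `sp_fraction` parameter, and all function names below are assumptions made for illustration only.

```python
from itertools import combinations


def decode(sequences, gap_positions):
    """Rebuild aligned rows from raw sequences plus integer gap columns.

    Assumed chromosome layout: one list of gap column indices per sequence.
    """
    width = len(sequences[0]) + len(gap_positions[0])
    rows = []
    for seq, gaps in zip(sequences, gap_positions):
        gap_set = set(gaps)
        # Every row must have the same width and only valid gap columns.
        assert len(seq) + len(gap_set) == width
        assert all(0 <= g < width for g in gap_set)
        residues = iter(seq)
        rows.append("".join("-" if col in gap_set else next(residues)
                            for col in range(width)))
    return rows


def sum_of_pairs(rows, match=1, mismatch=-1, gap=-2):
    """Sum-of-pairs score with an assumed toy scoring scheme."""
    score = 0
    for column in zip(*rows):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue  # gap-gap pairs are ignored in this sketch
            if a == "-" or b == "-":
                score += gap
            elif a == b:
                score += match
            else:
                score += mismatch
    return score


def total_conserved_columns(rows):
    """Count columns in which every row holds the same residue (no gaps)."""
    return sum(1 for column in zip(*rows)
               if column[0] != "-" and len(set(column)) == 1)


def bi_objective_select(population, n_select, sp_fraction=0.5):
    """Pick part of the survivors by SP score and the rest by conserved columns.

    sp_fraction is a hypothetical parameter; the source only says that one
    part is chosen by one objective and the rest by the other.
    """
    n_sp = int(n_select * sp_fraction)
    by_sp = sorted(population, key=sum_of_pairs, reverse=True)
    chosen = by_sp[:n_sp]
    remaining = by_sp[n_sp:]
    by_tcc = sorted(remaining, key=total_conserved_columns, reverse=True)
    return chosen + by_tcc[:n_select - n_sp]


sequences = ["ACGT", "AGT", "ACT"]
candidate_a = decode(sequences, [[], [1], [2]])  # ACGT / A-GT / AC-T
candidate_b = decode(sequences, [[], [3], [3]])  # ACGT / AGT- / ACT-
survivors = bi_objective_select([candidate_b, candidate_a], n_select=1,
                                sp_fraction=1.0)
```

Storing only gap columns keeps each chromosome short relative to the full alignment matrix, which is one plausible reading of why the abstract says the coding helps with longer sequences.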
Pages: 15871-15888
Number of pages: 18