A comprehensive survey on genetic algorithms for DNA motif prediction

Cited by: 23
Authors
Lee, Nung Kion [1]
Li, Xi [2]
Wang, Dianhui [3]
Affiliations
[1] Univ Malaysia Sarawak, Fac Cognit Sci & Human Dev, Sarawak, Malaysia
[2] Australia Natl Univ, John Curtin Sch Med Res, Canberra, ACT, Australia
[3] La Trobe Univ Melbourne, Dept Comp Sci & Informat Technol, Melbourne, Vic, Australia
Keywords
Genetic algorithm; DNA motif prediction; FACTOR-BINDING SITES; TRANSCRIPTIONAL REGULATORY ELEMENTS; COMPUTATIONAL IDENTIFICATION; INFORMATION-CONTENT; DISCOVERY; SPECIFICITY; REGIONS; SEQUENCES; PIPELINE; SIGNALS
DOI
10.1016/j.ins.2018.07.004
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Computational DNA motif discovery is important because it enables fast, cost-effective analysis of sequences enriched with DNA motifs, supports large-scale comparative studies, and allows hypotheses about biological problems to be tested. In this work, we provide a comprehensive survey of DNA motif discovery using genetic algorithms (GAs). According to how the solution domain is encoded, we categorize existing GA-based motif discovery techniques into search for consensus and search by position (matrix). Within each category, we compare the algorithms in terms of model representations, fitness functions, genetic operators, data post-processing, and experimental results. Moreover, we discuss the strengths and weaknesses of the different approaches and give recommendations for practical use. This survey is useful as a guideline for practitioners who would like to design GA solutions for DNA motif prediction in the future. (C) 2018 Elsevier Inc. All rights reserved.
Pages: 25-43
Page count: 19