Automatic motif discovery in an enzyme database using a genetic algorithm-based approach

Cited by: 2
Authors
Tsunoda, DF [1]
Lopes, HS [1]
Affiliations
[1] CPGEI, Lab Bioinformat, CEFET PR, BR-80230901 Curitiba, Parana, Brazil
DOI
10.1007/s00500-005-0490-z
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Proteins can be grouped into families according to features such as hydrophobicity, composition or structure, with the aim of establishing their common biological functions. This paper presents a system conceived to discover features (particular sequences of amino acids, or motifs) that occur very often in proteins of a given family but rarely in proteins of other families. These features can be used to classify unknown proteins, that is, to predict their function by analyzing the primary structure. Experiments were run on the enzyme subset extracted from the Protein Data Bank. The heuristic method is based on a genetic algorithm with operators specially tailored to the problem. The motifs found were used to build a decision tree with the C4.5 algorithm. The results were compared with motifs found by MEME, a freely available web tool. A further comparison was made with the classification results of two other systems: a neural network-based tool and a hidden Markov model-based tool. Final performance was measured using sensitivity (Se) and specificity (Sp): similar results were obtained for the proposed tool (78.79 and 95.82, respectively) and the neural network-based tool (74.65 and 94.80, respectively), while MEME and HMMER performed worse. Compared with the other approaches, the proposed system has the advantage of producing comprehensible rules. The results obtained for the enzyme dataset suggest that the proposed evolutionary computation method is very efficient at finding patterns for protein classification.
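To make the two reported measures concrete, the following is a minimal Python sketch of how a single candidate motif could be scored as a one-rule classifier ("sequence contains the motif, so it belongs to the family"). It uses the standard definitions Se = TP / (TP + FN) and Sp = TN / (TN + FP). The function names, the exact-substring matching and the toy sequences are illustrative assumptions; the abstract does not specify the fitness function, the motif representation or the matching rule used by the authors.

```python
from typing import Sequence, Tuple


def contains_motif(sequence: str, motif: str) -> bool:
    """Return True if the motif occurs as an exact substring (an assumed matching rule)."""
    return motif in sequence


def sensitivity_specificity(
    motif: str,
    family: Sequence[str],
    others: Sequence[str],
) -> Tuple[float, float]:
    """Score a motif as a one-rule classifier: 'contains motif' means 'in family'."""
    tp = sum(contains_motif(s, motif) for s in family)  # family members matched
    fn = len(family) - tp                               # family members missed
    fp = sum(contains_motif(s, motif) for s in others)  # outsiders wrongly matched
    tn = len(others) - fp                               # outsiders correctly rejected
    se = tp / (tp + fn) if family else 0.0
    sp = tn / (tn + fp) if others else 0.0
    return se, sp


# Toy, made-up sequences (hypothetical; not taken from the Protein Data Bank).
family = ["MKGHSAGLV", "TTGHSAGQW", "AGHSAGPLK", "MKLVPPQRT"]
others = ["MKLVWWQRT", "PPGHSAGAA", "QQRRTTLLM", "AAVVLLIIK", "WWYYFFHHK"]

se, sp = sensitivity_specificity("GHSAG", family, others)
print(f"Se = {se:.2%}, Sp = {sp:.2%}")  # Se = 75.00%, Sp = 80.00%
```

In a genetic algorithm, a score derived from these two numbers (for example their product) is one plausible way to reward motifs that are frequent inside the family and rare outside it; whether the paper combines them this way is not stated in the abstract.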
Pages: 325-330
Number of pages: 6
Published in: Soft Computing, 2006, 10: 325-330