Classification of the Adenylation and Acyl-Transferase Activity of NRPS and PKS Systems Using Ensembles of Substrate Specific Hidden Markov Models

Cited by: 68
Authors
Khayatt, Barzan I. [1]
Overmars, Lex [1,2]
Siezen, Roland J. [1,2,3,4]
Francke, Christof [1,2,3,4]
Affiliations
[1] Radboud Univ Nijmegen, Med Ctr, Ctr Mol & Biomol Informat, Nijmegen Ctr Mol Life Sci, NL-6525 ED Nijmegen, Netherlands
[2] Netherlands Bioinformat Ctr, Nijmegen, Netherlands
[3] Kluyver Ctr Genom Ind Fermentat, Delft, Netherlands
[4] TI Food & Nutr, Wageningen, Netherlands
Source
PLOS ONE | 2013 / Vol. 8 / Issue 04
Keywords
NONRIBOSOMAL PEPTIDE SYNTHETASES; POLYKETIDE SYNTHASES; NATURAL-PRODUCTS; BIOSYNTHESIS; PREDICTION; SEQUENCE; ANTIBIOTICS; ORGANIZATION; RECOGNITION; MECHANISMS;
DOI
10.1371/journal.pone.0062136
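Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract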
There is growing interest in the non-ribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs) of microbes, fungi and plants because they can produce bioactive peptides such as antibiotics. The ability to identify the substrate specificity of the enzymes' adenylation (A) and acyl-transferase (AT) domains is essential to rationally deduce or engineer new products. Here we report a Hidden Markov Model (HMM)-based ensemble method that predicts substrate specificity with high quality. We collected a new reference set of experimentally validated sequences. In line with most previously published prediction methods, we first performed a classification based on alignment and Neighbor Joining. We then created and tested single substrate-specific HMMs and found that they significantly improved correct identification for both A and AT domains. A major advantage of HMMs is that they remove the dependency on multiple sequence alignment and residue selection that hampers alignment-based clustering methods. With our models, the prediction quality for the substrate specificity of the A domains was similar to that of two recently published tools that use HMMs or Support Vector Machines (NRPSsp and NRPSpredictor2, respectively). Moreover, replacing the single substrate-specific HMMs with ensembles of models clearly increased prediction quality. We argue that the ensemble outperforms the single model because of the way substrate specificity evolves in the studied systems, and that this likely also holds for other protein domains. The ensemble predictor has been implemented in a simple web-based tool that is available at http://www.cmbi.ru.nl/NRPS-PKS-substrate-predictor/.
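The ensemble idea in the abstract can be illustrated with a minimal Python sketch. It assumes that each substrate is represented by several HMMs, that a query domain has already been scored against every model, and that the prediction is the substrate whose best-scoring model wins. The abstract does not state how the ensemble combines its models, so the max-score rule, the function name `predict_substrate`, the model names, and the scores below are all hypothetical.

```python
def predict_substrate(hits, min_score=0.0):
    """Pick a substrate from ensemble scores.

    hits: iterable of (substrate, model_id, score) tuples, one per HMM
          in the ensemble (scores are assumed to be precomputed).
    Returns (substrate, model_id, score) for the best hit, or None if
    there are no hits or the best score is below min_score.
    """
    # Keep only the best-scoring model for each substrate.
    best_per_substrate = {}
    for substrate, model_id, score in hits:
        current = best_per_substrate.get(substrate)
        if current is None or score > current[1]:
            best_per_substrate[substrate] = (model_id, score)

    if not best_per_substrate:
        return None

    # Assumed combination rule: the substrate with the highest best score wins.
    substrate, (model_id, score) = max(
        best_per_substrate.items(), key=lambda item: item[1][1]
    )
    if score < min_score:
        return None
    return substrate, model_id, score


# Hypothetical scores for one query A domain (not data from the paper).
example_hits = [
    ("Phe", "Phe_model_1", 180.2),
    ("Phe", "Phe_model_2", 212.4),
    ("Tyr", "Tyr_model_1", 195.0),
    ("Leu", "Leu_model_1", 40.7),
]

print(predict_substrate(example_hits))
```

With a single model per substrate, only `Phe_model_1` would be available for Phe and the Tyr model would win this toy example; with two Phe models, the second one captures the query. This is one way to picture why several models per substrate could help when a specificity has evolved more than once.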
Pages: 10