PHACTS, a computational approach to classifying the lifestyle of phages

Cited by: 191
Authors
McNair, Katelyn [1]
Bailey, Barbara A. [2]
Edwards, Robert A. [1, 3, 4]
Affiliations
[1] San Diego State Univ, Computat Sci Res Ctr, San Diego, CA 92182 USA
[2] San Diego State Univ, Dept Math & Stat, San Diego, CA 92182 USA
[3] San Diego State Univ, Dept Comp Sci, San Diego, CA 92182 USA
[4] Argonne Natl Lab, Math & Comp Sci Div, Argonne, IL 60439 USA
Funding
National Science Foundation (USA);
Keywords
EVOLUTIONARY RELATIONSHIPS; BACTERIOPHAGES; TAXONOMY; SEQUENCE; VIRUSES;
DOI
10.1093/bioinformatics/bts014
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Motivation: Bacteriophages have two distinct lifestyles: virulent and temperate. The virulent lifestyle has many implications for phage therapy, genomics and microbiology. The lifestyle of a newly sequenced phage is currently determined using standard culturing techniques. Such laboratory work is not only costly and time consuming, but also cannot be applied to phage genomes assembled from environmental sequencing. Therefore, a computational method that uses the sequence data of phage genomes is needed.
Results: Phage Classification Tool Set (PHACTS) uses a novel similarity algorithm and a supervised Random Forest classifier to predict whether the lifestyle of a phage, described by its proteome, is virulent or temperate. The similarity algorithm builds a training set from phages with known lifestyles; this set, together with the lifestyle annotations, is used to train a Random Forest that classifies the lifestyle of a phage. PHACTS predictions are shown to have 99% precision.
Availability and implementation: PHACTS was implemented in the PERL programming language and uses the FASTA program (Pearson and Lipman, 1988) and the R programming language library 'Random Forest' (Liaw and Weiner, 2010). The PHACTS software is open source and is available as a downloadable stand-alone version or can be accessed online through a user-friendly web interface. The source code, help files and online version are available at http://www.phantome.org/PHACTS/.
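The two-stage idea in the abstract (turn each proteome into similarity scores, then train a tree ensemble on phages of known lifestyle) can be illustrated with a minimal sketch. This is not the PHACTS implementation: the k-mer Jaccard score is an assumed stand-in for FASTA alignment scores, the bagged decision stumps with random feature selection are a much-simplified stand-in for the R Random Forest, and every sequence, marker protein and function name below is hypothetical toy data.

```python
import random
from collections import Counter

def kmers(seq, k=3):
    """Set of overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def similarity(a, b, k=3):
    """Jaccard similarity of k-mer sets (assumed stand-in for a FASTA score)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)

def featurize(proteome, reference_proteins):
    """One feature per reference protein: best similarity to any phage protein."""
    return [max((similarity(p, r) for p in proteome), default=0.0)
            for r in reference_proteins]

def majority(labels, default):
    """Most common label, or a default when the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else default

def train_stump(X, y, rng):
    """Fit a one-split tree on a bootstrap sample, using one random feature."""
    n = len(X)
    rows = [rng.randrange(n) for _ in range(n)]   # bootstrap sample
    feat = rng.randrange(len(X[0]))               # random feature choice
    overall = majority([y[i] for i in rows], y[0])
    best = (n + 1, 0.0, overall, overall)
    for thr in sorted({X[i][feat] for i in rows}):
        lo = majority([y[i] for i in rows if X[i][feat] <= thr], overall)
        hi = majority([y[i] for i in rows if X[i][feat] > thr], overall)
        errors = sum((lo if X[i][feat] <= thr else hi) != y[i] for i in rows)
        if errors < best[0]:
            best = (errors, thr, lo, hi)
    _, thr, lo, hi = best
    return feat, thr, lo, hi

def train_forest(X, y, n_trees=51, seed=0):
    """Train an ensemble of stumps; seeded so runs are repeatable."""
    rng = random.Random(seed)
    return [train_stump(X, y, rng) for _ in range(n_trees)]

def predict(forest, x):
    """Majority vote over stumps; returns (label, fraction of votes)."""
    votes = Counter(lo if x[f] <= thr else hi for f, thr, lo, hi in forest)
    label, count = votes.most_common(1)[0]
    return label, count / len(forest)

# Hypothetical toy data: two made-up marker proteins and six labeled phages.
REFERENCES = ["MKLTVRRALE", "GSPDQWNHFY"]
TRAINING = [
    (["MKLTVRRALE", "CICCIICIC"], "temperate"),
    (["MKLTVRRA", "CIICIC"], "temperate"),
    (["LTVRRALE"], "temperate"),
    (["GSPDQWNHFY", "CICCIICIC"], "virulent"),
    (["GSPDQWNH"], "virulent"),
    (["PDQWNHFY", "CIICIC"], "virulent"),
]

X = [featurize(proteome, REFERENCES) for proteome, _ in TRAINING]
y = [label for _, label in TRAINING]
forest = train_forest(X, y)

query = ["KLTVRRAL", "CCIIC"]  # unlabeled toy proteome
label, vote_fraction = predict(forest, featurize(query, REFERENCES))
print(label, vote_fraction)
```

The vote fraction gives a rough sense of how strongly the ensemble agrees on a call. A real pipeline would replace `similarity` with alignment scores and the stump ensemble with a full Random Forest library.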
Pages: 614-618
Page count: 5