Conservation of Gene Cassettes among Diverse Viruses of the Human Gut

Cited by: 27
Authors
Minot, Samuel [1 ]
Wu, Gary D. [2 ]
Lewis, James D. [3 ]
Bushman, Frederic D. [1 ]
Affiliations
[1] Univ Penn, Dept Microbiol, Perelman Sch Med, Philadelphia, PA 19104 USA
[2] Univ Penn, Div Gastroenterol, Perelman Sch Med, Philadelphia, PA 19104 USA
[3] Univ Penn, Ctr Clin Epidemiol & Biostat, Perelman Sch Med, Philadelphia, PA 19104 USA
Keywords
DE-NOVO ASSEMBLER; MODULAR EVOLUTION; PHAGE; DNA; BACTERIOPHAGES; PROTEINS; GENOMICS; SEQUENCE; FEATURES; LAMBDA;
DOI
10.1371/journal.pone.0042342
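Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Subject classification code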
07 ; 0710 ; 09 ;
Abstract
Viruses are a crucial component of the human microbiome, but large population sizes, high sequence diversity, and high frequencies of novel genes have hindered genomic analysis by high-throughput sequencing. Here we investigate approaches to metagenomic assembly to probe genome structure in a sample of 5.6 Gb of gut viral DNA sequence from six individuals. Tests showed that a new pipeline based on DeBruijn graph assembly yielded longer contigs that were able to recruit more reads than the equivalent non-optimized, single-pass approach. To characterize gene content, the database of viral RefSeq proteins was compared to the assembled viral contigs, generating a bipartite graph with functional cassettes linking together viral contigs, which revealed a high degree of connectivity between diverse genomes involving multiple genes of the same functional class. In a second step, open reading frames were grouped by their co-occurrence on contigs in a database-independent manner, revealing conserved cassettes of co-oriented ORFs. These methods reveal that free-living bacteriophages, while usually dissimilar at the nucleotide level, often have significant similarity at the level of encoded amino acid motifs, gene order, and gene orientation. These findings thus connect contemporary metagenomic analysis with classical studies of bacteriophage genomic cassettes. Software is available at https://sourceforge.net/projects/optitdba/.
Pages: 9
Related papers
49 in total
[1]   Distant Mimivirus relative with a larger genome highlights the fundamental features of Megaviridae [J].
Arslan, Defne ;
Legendre, Matthieu ;
Seltzer, Virginie ;
Abergel, Chantal ;
Claverie, Jean-Michel .
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2011, 108 (42) :17486-17491
[2]  
Botstein D, 1980, Ann N Y Acad Sci, V354, P484, DOI 10.1111/j.1749-6632.1980.tb27987.x
[3]   PROPERTIES OF HYBRIDS BETWEEN SALMONELLA PHAGE-P22 AND COLIPHAGE LAMBDA [J].
BOTSTEIN, D ;
HERSKOWITZ, I .
NATURE, 1974, 251 (5476) :584-589
[4]   Metagenomic analyses of an uncultured viral community from human feces [J].
Breitbart, M ;
Hewson, I ;
Felts, B ;
Mahaffy, JM ;
Nulton, J ;
Salamon, P ;
Rohwer, F .
JOURNAL OF BACTERIOLOGY, 2003, 185 (20) :6220-6223
[5]   Viral diversity and dynamics in an infant gut [J].
Breitbart, Mya ;
Haynes, Matthew ;
Kelley, Scott ;
Angly, Florent ;
Edwards, Robert A. ;
Felts, Ben ;
Mahaffy, Joseph M. ;
Mueller, Jennifer ;
Nulton, James ;
Rayhawk, Steve ;
Rodriguez-Brito, Beltran ;
Salamon, Peter ;
Rohwer, Forest .
RESEARCH IN MICROBIOLOGY, 2008, 159 (05) :367-373
[6]   BLAST+: architecture and applications [J].
Camacho, Christiam ;
Coulouris, George ;
Avagyan, Vahram ;
Ma, Ning ;
Papadopoulos, Jason ;
Bealer, Kevin ;
Madden, Thomas L. .
BMC BIOINFORMATICS, 2009, 10
[7]   Evaluation of short read metagenomic assembly [J].
Charuvaka, Anveshi ;
Rangwala, Huzefa .
BMC GENOMICS, 2011, 12
[8]   Identifying bacterial genes and endosymbiont DNA with Glimmer [J].
Delcher, Arthur L. ;
Bratke, Kirsten A. ;
Powers, Edwin C. ;
Salzberg, Steven L. .
BIOINFORMATICS, 2007, 23 (06) :673-679
[9]   Search and clustering orders of magnitude faster than BLAST [J].
Edgar, Robert C. .
BIOINFORMATICS, 2010, 26 (19) :2460-2461
[10]  
Hendrix R. W, 1983, LAMBDA, VII