Paired-end sequencing of Fosmid libraries by Illumina

Cited by: 39
Authors
Williams, Louise J. S. [1]
Tabbaa, Diana G. [1]
Li, Na [1]
Berlin, Aaron M. [1]
Shea, Terrance P. [1]
MacCallum, Iain [1]
Lawrence, Michael S. [1]
Drier, Yotam [1]
Getz, Gad [1]
Young, Sarah K. [1]
Jaffe, David B. [1]
Nusbaum, Chad [1]
Gnirke, Andreas [1]
Affiliation
[1] Broad Inst MIT & Harvard, Cambridge, MA 02141 USA
Funding
National Institutes of Health (NIH), USA
Keywords
STRUCTURAL VARIATION; GENOME SEQUENCE; SHORT-READ; HUMAN DNA; FRAGMENTS; CLONING;
DOI
10.1101/gr.138925.112
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Eliminating the bacterial cloning step has been a major factor in the vastly improved efficiency of massively parallel sequencing approaches. However, this also has made it a technical challenge to produce the modern equivalent of the Fosmid- or BAC-end sequences that were crucial for assembling and analyzing complex genomes during the Sanger-based sequencing era. To close this technology gap, we developed Fosill, a method for converting Fosmids to Illumina-compatible jumping libraries. We constructed Fosmid libraries in vectors with Illumina primer sequences and specific nicking sites flanking the cloning site. Our family of pFosill vectors allows multiplex Fosmid cloning of end-tagged genomic fragments without physical size selection and is compatible with standard and multiplex paired-end Illumina sequencing. To excise the bulk of each cloned insert, we introduced two nicks in the vector, translated them into the inserts, and cleaved them. Recircularization of the vector via coligation of insert termini followed by inverse PCR generates a jumping library for paired-end sequencing with 101-base reads. The yield of unique Fosmid-sized jumps is sufficiently high, and the background of short, incorrectly spaced and chimeric artifacts sufficiently low, to enable applications such as mapping of structural variation and scaffolding of de novo assemblies. We demonstrate the power of Fosill to map genome rearrangements in a cancer cell line and identify three fusion genes that were corroborated by RNA-seq data. Our Fosill-powered assembly of the mouse genome has an N50 scaffold length of 17.0 Mb, rivaling the connectivity (16.9 Mb) of the Sanger-sequencing-based draft assembly.
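The N50 scaffold length quoted in the abstract is a standard contiguity statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. The following is a minimal sketch of that general definition, not the authors' assembly pipeline; the function name `n50` and the toy scaffold lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the largest length L such that scaffolds of length >= L
    together cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold downward, accumulating bases
    # until half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in arbitrary units, not data from the paper).
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_lengths)
```

For the toy lengths, the total is 20, and the two longest scaffolds (6 and 5) already cover 11 of it, so the N50 is 5.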
Pages: 2241-2249
Number of pages: 9