Generation and application of pseudo-long reads for metagenome assembly

Authors
Sim, Mikang [1]
Lee, Jongin [1]
Wy, Suyeon [1]
Park, Nayoung [1]
Lee, Daehwan [1]
Kwon, Daehong [1]
Kim, Jaebum [1]
Affiliations
[1] Konkuk Univ, Dept Biomed Sci & Engn, 120 Neungdong Ro, Seoul 05029, South Korea
Source
GIGASCIENCE | 2022 / Volume 11
Keywords
next-generation sequencing; metagenomic assembly; pseudo-long read
DOI
Not available
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
Background: Metagenomic assembly using high-throughput sequencing data is a powerful method for reconstructing microbial genomes from environmental samples without cultivation. However, metagenomic assembly is complex and challenging, especially when only short reads are available, because a metagenome is a mixture of genomes from multiple microorganisms. Although long-read sequencing technologies have begun to be used for metagenomic assembly, many metagenomic studies still rely on short reads because long reads cost more to generate.
Results: In this study, we present a new method called PLR-GEN. It creates pseudo-long reads from metagenomic short reads based on given reference genome sequences, taking into account the small sequence variations that exist among individual genomes of the same or different species. When applied to a mock community data set from the Human Microbiome Project, PLR-GEN extended 101-bp short reads to pseudo-long reads with an N50 of 33 Kbp and an error rate of 0.4%. Using these pseudo-long reads clearly improved metagenomic assembly in terms of the number of sequences, assembly contiguity, and the prediction of species and genes.
Conclusions: PLR-GEN can generate artificial long-read sequences at no additional sequencing cost, thus aiding various studies that use metagenomes.
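The abstract reports read contiguity as an N50 value. As a minimal sketch of how that statistic is commonly computed (this is not code from PLR-GEN; the function name `n50` and the example lengths are illustrative assumptions, not data from the paper):

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the largest length L such that sequences of length >= L
    together contain at least half of the total number of bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest sequences first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical read lengths in bp (illustrative values only).
example_lengths = [100, 200, 300, 400, 500]
print(n50(example_lengths))
```

Because N50 is weighted by bases rather than by read count, a handful of long reads can raise it even when most reads remain short, so it is usually read alongside the error rate and the total number of sequences.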
Pages: 10
Related papers
17 in total
  • [1] Generation and application of pseudo-long reads for metagenome assembly
    Sim, Mikang
    Lee, Jongin
    Wy, Suyeon
    Park, Nayoung
    Lee, Daehwan
    Kwon, Daehong
    Kim, Jaebum
    GIGASCIENCE, 2022, 11
  • [3] Genome Sequencing and Assembly by Long Reads in Plants
    Li, Changsheng
    Lin, Feng
    An, Dong
    Wang, Wenqin
    Huang, Ruidong
    GENES, 2018, 9 (01)
  • [4] WHATSHAP: Weighted Haplotype Assembly for Future-Generation Sequencing Reads
    Patterson, Murray
    Marschall, Tobias
    Pisanti, Nadia
    Van Iersel, Leo
    Stougie, Leen
    Klau, Gunnar W.
    Schonhuth, Alexander
    JOURNAL OF COMPUTATIONAL BIOLOGY, 2015, 22 (06) : 498 - 509
  • [5] De novo assembly of the complex genome of Nippostrongylus brasiliensis using MinION long reads
    David Eccles
    Jodie Chandler
    Mali Camberis
    Bernard Henrissat
    Sergey Koren
    Graham Le Gros
    Jonathan J. Ewbank
    BMC Biology, 16
  • [6] De novo assembly of the complex genome of Nippostrongylus brasiliensis using MinION long reads
    Eccles, David
    Chandler, Jodie
    Camberis, Mali
    Henrissat, Bernard
    Koren, Sergey
    Le Gros, Graham
    Ewbank, Jonathan J.
    BMC BIOLOGY, 2018, 16
  • [7] Metagenome assembly through clustering of next-generation sequencing data using protein sequences
    Sim, Mikang
    Kim, Jaebum
    JOURNAL OF MICROBIOLOGICAL METHODS, 2015, 109 : 180 - 187
  • [8] Assembly-free genome comparison based on next-generation sequencing reads and variable length patterns
    Matteo Comin
    Michele Schimd
    BMC Bioinformatics, 15
  • [9] Assembly-free genome comparison based on next-generation sequencing reads and variable length patterns
    Comin, Matteo
    Schimd, Michele
    BMC BIOINFORMATICS, 2014, 15
  • [10] Pseudo-Sanger sequencing: massively parallel production of long and near error-free reads using NGS technology
    Jue Ruan
    Lan Jiang
    Zechen Chong
    Qiang Gong
    Heng Li
    Chunyan Li
    Yong Tao
    Caihong Zheng
    Weiwei Zhai
    David Turissini
    Charles H Cannon
    Xuemei Lu
    Chung-I Wu
    BMC Genomics, 14