Comprehensive simulation of metagenomic sequencing data with non-uniform sampling distribution

Authors
Shansong Liu
Kui Hua
Sijie Chen
Xuegong Zhang
Affiliation
[1] MOE Key Lab of Bioinformatics, Bioinformatics Division, TNLIST and Department of Automation, Tsinghua University
DOI
Not available
Chinese Library Classification number
Q811.4 [Bioinformatics]
Discipline classification codes
0711; 0831
Abstract
Background: Metagenomic sequencing is a complex sampling procedure from unknown mixtures of many genomes. Metagenome data with known genome compositions are essential both for benchmarking bioinformatics software and for investigating how various factors influence the data. Compared with data from real microbiome samples or from defined microbial mock communities, simulated data from proper computational models are better suited to this purpose because they offer more flexibility in controlling multiple factors.
Methods: We developed a non-uniform metagenomic sequencing simulation system (nuMetaSim) that can mimic various factors in real metagenomic sequencing, reflecting multiple properties of real data through customizable parameter settings.
Results: We generated 9 comprehensive metagenomic datasets of differing composition complexity from 203 bacterial genomes and 2 archaeal genomes related to the human intestinal system.
Conclusion: The data can serve as benchmarks for comparing the performance of different methods in different situations, and the software package allows users to generate simulation data that better reflect the specific properties of their own scenarios.
Pages: 175-185
Page count: 11