MetaCC allows scalable and integrative analyses of both long-read and short-read metagenomic Hi-C data

Cited by: 6
Authors
Du, Yuxuan [1 ]
Sun, Fengzhu [1 ]
Affiliations
[1] Univ Southern Calif, Dept Quantitat & Computat Biol, Los Angeles, CA 90007 USA
Keywords
SP NOV; GENOME; ALGORITHM;
DOI
10.1038/s41467-023-41209-6
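Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline classification code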
07 ; 0710 ; 09 ;
Abstract
Metagenomic Hi-C (metaHi-C) can identify contig-to-contig relationships with respect to their proximity within the same physical cell. Shotgun libraries in metaHi-C experiments can be constructed by next-generation sequencing (short-read metaHi-C) or more recent third-generation sequencing (long-read metaHi-C). However, all existing metaHi-C analysis methods were developed and benchmarked on short-read metaHi-C datasets, and there remains substantial room for improvement in terms of more scalable and stable analyses, especially for long-read metaHi-C data. Here we report MetaCC, an efficient and integrative framework for analyzing both short-read and long-read metaHi-C datasets. MetaCC outperforms existing methods on normalization and binning. In particular, the MetaCC normalization module, named NormCC, is more than 3000 times faster than the current state-of-the-art method HiCzin on a complex wastewater dataset. When applied to a sheep gut long-read metaHi-C dataset, the MetaCC binning module can retrieve 709 high-quality genomes with the largest species diversity from a single sample, including an expansion of five uncultured members from the order Erysipelotrichales, and is the only binner that can recover the genome of the important species Bacteroides vulgatus. Further plasmid analyses reveal that MetaCC binning is able to capture multi-copy plasmids. The authors develop an integrative and scalable framework to eliminate systematic biases and retrieve high-quality metagenome-assembled genomes using either long-read or short-read metagenomic Hi-C data.
Pages: 12