MetaCC allows scalable and integrative analyses of both long-read and short-read metagenomic Hi-C data

Cited by: 6
Authors
Du, Yuxuan [1]
Sun, Fengzhu [1]
Affiliation
[1] Univ Southern Calif, Dept Quantitat & Computat Biol, Los Angeles, CA 90007 USA
Keywords
SP NOV; GENOME; ALGORITHM
DOI
10.1038/s41467-023-41209-6
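Chinese Library Classification
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09
Abstract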
Metagenomic Hi-C (metaHi-C) can identify contig-to-contig relationships based on their proximity within the same physical cell. Shotgun libraries in metaHi-C experiments can be constructed by next-generation sequencing (short-read metaHi-C) or by more recent third-generation sequencing (long-read metaHi-C). However, all existing metaHi-C analysis methods were developed and benchmarked on short-read metaHi-C datasets, and there is considerable room for more scalable and stable analyses, especially for long-read metaHi-C data. Here we report MetaCC, an efficient and integrative framework for analyzing both short-read and long-read metaHi-C datasets. MetaCC outperforms existing methods on normalization and binning. In particular, the MetaCC normalization module, named NormCC, is more than 3000 times faster than the current state-of-the-art method HiCzin on a complex wastewater dataset. When applied to a sheep gut long-read metaHi-C dataset, the MetaCC binning module retrieves 709 high-quality genomes with the largest species diversity from a single sample, including an expansion of five uncultured members of the order Erysipelotrichales, and is the only binner that recovers the genome of the important species Bacteroides vulgatus. Further plasmid analyses reveal that MetaCC binning can capture multi-copy plasmids. The authors develop an integrative and scalable framework to eliminate systematic biases and retrieve high-quality metagenome-assembled genomes using either long-read or short-read metagenomic Hi-C data.
Pages: 12