MetaCC allows scalable and integrative analyses of both long-read and short-read metagenomic Hi-C data

Cited by: 6

Authors:

Du, Yuxuan [1]

Sun, Fengzhu [1]

Affiliations:

[1] Univ Southern Calif, Dept Quantitat & Computat Biol, Los Angeles, CA 90007 USA

Source:

NATURE COMMUNICATIONS | 2023 / Volume 14 / Issue 01

Keywords:

SP NOV; GENOME; ALGORITHM;

DOI:

10.1038/s41467-023-41209-6

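Chinese Library Classification (CLC) codes:

O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];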

Discipline classification codes:

07 ; 0710 ; 09 ;

Abstract:

Metagenomic Hi-C (metaHi-C) can identify contig-to-contig relationships with respect to their proximity within the same physical cell. Shotgun libraries in metaHi-C experiments can be constructed by next-generation sequencing (short-read metaHi-C) or more recent third-generation sequencing (long-read metaHi-C). However, all existing metaHi-C analysis methods are developed and benchmarked on short-read metaHi-C datasets and there exists much room for improvement in terms of more scalable and stable analyses, especially for long-read metaHi-C data. Here we report MetaCC, an efficient and integrative framework for analyzing both short-read and long-read metaHi-C datasets. MetaCC outperforms existing methods on normalization and binning. In particular, the MetaCC normalization module, named NormCC, is more than 3000 times faster than the current state-of-the-art method HiCzin on a complex wastewater dataset. When applied to one sheep gut long-read metaHi-C dataset, the MetaCC binning module can retrieve 709 high-quality genomes with the largest species diversity using a single sample, including an expansion of five uncultured members from the order Erysipelotrichales, and is the only binner that can recover the genome of one important species, Bacteroides vulgatus. Further plasmid analyses reveal that MetaCC binning is able to capture multi-copy plasmids. The authors develop an integrative and scalable framework to eliminate systematic biases and retrieve high-quality metagenome-assembled genomes using either long-read or short-read metagenomic Hi-C data.

Pages: 12
