GreenHill: a de novo chromosome-level scaffolding and phasing tool using Hi-C

Cited by: 4
Authors
Ouchi, Shun [1]
Kajitani, Rei [1]
Itoh, Takehiko [1]
Affiliations
[1] Tokyo Inst Technol, Sch Life Sci & Technol, 2-12-1 Ookayama, Meguro Ku, Tokyo 1528550, Japan
Keywords
Genome assembly; Haplotype; Hi-C; Scaffolding; Phasing;
DOI
10.1186/s13059-023-03006-8
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005; 0836; 090102; 100705;
Abstract
Chromosome-level haplotype-resolved genome assembly is an important resource in molecular biology. However, current de novo haplotype assemblers require parental data or reference genomes and often fail to provide chromosome-level results. We present GreenHill, a novel scaffolding and phasing tool that considers various assemblers' contigs as input to reconstruct chromosome-level haplotypes using Hi-C without parental or reference data. Its unique functions include new error correction based on Hi-C contacts and the simultaneous use of Hi-C and long reads. Benchmarks reveal that GreenHill outperforms other approaches in contiguity and phasing accuracy, and the majority of chromosome arms are entirely phased.
Pages: 27