Complex genome assembly based on long-read sequencing

Cited by: 8
Authors
Zhang, Tianjiao [1]
Zhou, Jie [1]
Gao, Wentao [1]
Jia, Yuran [1]
Wei, Yanan [1]
Wang, Guohua [1]
Affiliations
[1] Northeast Forestry Univ China, Coll Informat & Comp Engn, Harbin, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
genome assembly; haplotype; long-read sequencing; DE-BRUIJN GRAPHS; ACCURATE;
DOI
10.1093/bib/bbac305
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010 ; 081704 ;
Abstract
High-quality chromosome-scale genome sequences provide an important basis for downstream genomic analysis. In particular, haplotype-resolved and complete genomes play a key role in genome annotation, mutation detection, evolutionary analysis, gene function research, comparative genomics and other areas. However, genome-wide short-read sequencing struggles to produce a complete assembly for complex genomes with high duplication and multiple heterozygosity. The emergence of long-read sequencing technology has greatly improved the completeness of complex genome assemblies. We review a variety of computational methods for complex genome assembly and describe in detail the theories, innovations and shortcomings of collapsed, semi-collapsed and uncollapsed assemblers based on long reads. Among the three methods, uncollapsed assembly is the most correct and complete way to represent genomes. In addition, genome assembly is closely related to haplotype reconstruction: uncollapsed assembly realizes haplotype reconstruction, and haplotype reconstruction in turn promotes uncollapsed assembly. We hope that gapless, telomere-to-telomere and accurate assembly of complex genomes can become truly routine in the future using only a simple process or a single tool.
Pages: 11