Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly

Cited by: 32
Authors
Holley, Guillaume [1]
Beyter, Doruk [1]
Ingimundardottir, Helga [1]
Moller, Peter L. [2]
Kristmundsdottir, Snodis [1,3]
Eggertsson, Hannes P. [1]
Halldorsson, Bjarni V. [1,3]
Affiliations
[1] Amgen Inc, deCODE Genet, Reykjavik, Iceland
[2] Aarhus Univ, Dept Biomed, Aarhus, Denmark
[3] Reykjavik Univ, Sch Technol, Reykjavik, Iceland
Keywords
GENOME; LIBRARY;
DOI
10.1186/s13059-020-02244-4
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
A major challenge with long-read sequencing data is its high error rate of up to 15%. We present Ratatosk, a method to correct long reads with short-read data. We demonstrate on 5 human genome trios that Ratatosk reduces the error rate of long reads 6-fold on average, with a median error rate as low as 0.22%. SNP calls in Ratatosk-corrected reads are nearly 99% accurate, and indel call accuracy is increased by up to 37%. An assembly of Ratatosk-corrected reads from an Ashkenazi individual yields a contig N50 of 45 Mbp and fewer misassemblies than an assembly of PacBio HiFi reads.
Pages: 22
Related papers
13 in total
  • [1] Efficient Hybrid De Novo Error Correction and Assembly for Long Reads
    Kchouk, Mehdi
    Elloumi, Mourad
    2016 27TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS (DEXA), 2016, : 88 - 92
  • [2] Hybrid Error Correction approach and DeNovo Assembly for MinIon Sequencing Long Reads
    Kchouk, Mehdi
    Elloumi, Mourad
    2016 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2016, : 122 - 125
  • [3] Jabba: hybrid error correction for long sequencing reads
    Miclotte, Giles
    Heydari, Mahdi
    Demeester, Piet
    Rombauts, Stephane
    Van de Peer, Yves
    Audenaert, Pieter
    Fostier, Jan
    ALGORITHMS FOR MOLECULAR BIOLOGY, 2016, 11
  • [4] Efficient assembly of nanopore reads via highly accurate and intact error correction
    Chen, Ying
    Nie, Fan
    Xie, Shang-Qian
    Zheng, Ying-Feng
    Dai, Qi
    Bray, Thomas
    Wang, Yao-Xin
    Xing, Jian-Feng
    Huang, Zhi-Jian
    Wang, De-Peng
    He, Li-Juan
    Luo, Feng
    Wang, Jian-Xin
    Liu, Yi-Zhi
    Xiao, Chuan-Le
    NATURE COMMUNICATIONS, 2021, 12 (01)
  • [5] Local read haplotagging enables accurate long-read small variant calling
    Kolesnikov, Alexey
    Cook, Daniel
    Nattestad, Maria
    Brambrink, Lucas
    McNulty, Brandy
    Gorzynski, John
    Goenka, Sneha
    Ashley, Euan A.
    Jain, Miten
    Miga, Karen H.
    Paten, Benedict
    Chang, Pi-Chuan
    Carroll, Andrew
    Shafin, Kishwar
    NATURE COMMUNICATIONS, 2024, 15 (01)
  • [6] A comparative evaluation of hybrid error correction methods for error-prone long reads
    Fu, Shuhua
    Wang, Anqi
    Au, Kin Fai
    GENOME BIOLOGY, 2019, 20 (01)
  • [7] Dysgu: efficient structural variant calling using short or long reads
    Cleal, Kez
    Baird, Duncan M.
    NUCLEIC ACIDS RESEARCH, 2022, 50 (09) : E53
  • [8] HASLR: Fast Hybrid Assembly of Long Reads
    Haghshenas, Ehsan
    Asghari, Hossein
    Stoye, Jens
    Chauve, Cedric
    Hach, Faraz
    ISCIENCE, 2020, 23 (08)
  • [9] Performance difference of graph-based and alignment-based hybrid error correction methods for error-prone long reads
    Wang, Anqi
    Au, Kin Fai
    GENOME BIOLOGY, 2020, 21 (01)
  • [10] Local assembly of long reads enables phylogenomics of transposable elements in a polyploid cell line
    Han, Shunhua
    Dias, Guilherme B.
    Basting, Preston J.
    Viswanatha, Raghuvir
    Perrimon, Norbert
    Bergman, Casey M.
    NUCLEIC ACIDS RESEARCH, 2022, 50 (21) : E124