CRAM 3.1: advances in the CRAM file format

Cited by: 15
Authors
Bonfield, James K. [1]
Affiliations
[1] Wellcome Sanger Inst, Informat & Digital Solut, Wellcome Genome Campus, Hinxton CB10 1SA, England
Funding
Wellcome Trust (UK);
Keywords
COMPRESSION;
DOI
10.1093/bioinformatics/btac010
Chinese Library Classification
Q5 [Biochemistry];
Subject classification codes
071010; 081704;
Abstract
Motivation: CRAM has established itself as a high compression alternative to the BAM file format for DNA sequencing data. We describe updates to further improve this on modern sequencing instruments. Results: With Illumina data CRAM 3.1 is 7-15% smaller than the equivalent CRAM 3.0 file, and 50-70% smaller than the corresponding BAM file. Long-read technology shows more modest compression due to the presence of high-entropy signals.
Pages: 1497-1503
Page count: 7