DSRC 2-Industry-oriented compression of FASTQ files

Cited by: 67
Authors
Roguski, Lukasz [1]
Deorowicz, Sebastian [2]
Affiliations
[1] Polish Japanese Inst Informat Technol, PL-02008 Warsaw, Poland
[2] Silesian Tech Univ, Inst Informat, PL-44100 Gliwice, Poland
Keywords
READS;
DOI
10.1093/bioinformatics/btu208
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Subject classification codes
071010; 081704
Abstract
Modern sequencing platforms produce huge amounts of data. Archiving them raises major problems but is crucial for reproducibility of results, one of the most fundamental principles of science. The widely used gzip compressor, used for reduction of storage and transfer costs, is not a perfect solution, so a few specialized FASTQ compressors were proposed recently. Unfortunately, they are often impractical because of slow processing, lack of support for some variants of FASTQ files or instability. We propose DSRC 2 that offers compression ratios comparable with the best existing solutions, while being a few times faster and more flexible.
Pages: 2213-2215
Page count: 3