Compression of Next-Generation Sequencing Data and of DNA Digital Files

Cited by: 3
Authors
Carpentieri, Bruno [1]
Affiliations
[1] Univ Salerno, Dipartimento Informat, Via Giovanni Paolo II 132, I-84084 Fisciano, SA, Italy
Keywords
data compression; Next-Generation Sequencing data; DNA; genomes; genomic data
DOI
10.3390/a13060151
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The memory and network traffic consumed by newly sequenced biological data have grown sharply in recent years. Genomic projects such as HapMap and 1000 Genomes have contributed to the very large growth of databases and network traffic related to genomic data, and to the development of new, efficient technologies. Large-scale sequencing of DNA samples has attracted new attention and produced new research, and interest in genomic data within the scientific community has greatly increased. In a very short time, researchers have developed hardware tools, analysis software, algorithms, private databases, and infrastructures to support research in genomics. In this paper, we analyze different approaches for compressing digital files that are generated by Next-Generation Sequencing tools and contain nucleotide sequences, and we discuss and evaluate the compression performance of generic compression algorithms by comparing them with Quip, a system designed by Jones et al. specifically for genomic file compression. Moreover, we present a simple but effective technique for the compression of DNA sequences in which we consider only the relevant DNA data, and we experimentally evaluate its performance.
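The comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation and does not use Quip. It assumes the input is a FASTQ file with four lines per record, reads "the relevant DNA data" as the nucleotide line of each record, and uses Python's standard `zlib`, `bz2`, and `lzma` modules as stand-ins for generic compressors. The function names `extract_sequences` and `compressed_sizes` are hypothetical.

```python
import bz2
import lzma
import zlib


def extract_sequences(fastq_text: str) -> str:
    """Keep only the nucleotide line (2nd of every 4 lines) of each record.

    Assumption: a well-formed FASTQ file with exactly four lines per read
    (header, bases, '+', qualities). Headers and qualities are discarded.
    """
    lines = fastq_text.strip().splitlines()
    return "\n".join(lines[i] for i in range(1, len(lines), 4))


def compressed_sizes(data: bytes) -> dict:
    """Return the size in bytes of `data` under a few generic compressors."""
    return {
        "raw": len(data),
        "zlib": len(zlib.compress(data, 9)),
        "bz2": len(bz2.compress(data, 9)),
        "lzma": len(lzma.compress(data)),
    }
```

Calling `compressed_sizes` once on the full file contents and once on `extract_sequences(...)` of the same file gives a rough view of how much of the compressed size comes from the nucleotide data alone, for whichever generic compressor is of interest.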
Pages: 11
Related papers
50 in total
  • [41] SeqSQC: A Bioconductor Package for Evaluating the Sample Quality of Next-generation Sequencing Data
    Liu, Qian
    Hu, Qiang
    Yao, Song
    Kwan, Marilyn L.
    Roh, Janise M.
    Zhao, Hua
    Ambrosone, Christine B.
    Kushi, Lawrence H.
    Liu, Song
    Zhu, Qianqian
    GENOMICS PROTEOMICS & BIOINFORMATICS, 2019, 17 (02) : 211 - 218
  • [42] SPARSE SIGNAL RECOVERY METHODS FOR VARIANT DETECTION IN NEXT-GENERATION SEQUENCING DATA
    Banuelos, Mario
    Almanza, Rubi
    Adhikari, Lasith
    Sindi, Suzanne
    Marcia, Roummel F.
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016 : 864 - 868
  • [43] QACtools: A Quality Assessment and Quality Control Tool for Next-Generation Sequencing Data
    Song, Dandan
    Li, Ning
    Liao, Lejian
    PROCEEDINGS OF THE 2015 CHINESE INTELLIGENT AUTOMATION CONFERENCE: INTELLIGENT TECHNOLOGY AND SYSTEMS, 2015, 338 : 463 - 470
  • [44] Human Leukocyte Antigen Typing by Next-Generation Sequencing
    Profaizer, Tracie
    Kumanovics, Attila
    CLINICS IN LABORATORY MEDICINE, 2018, 38 (04) : 565 - +
  • [45] Clinical applications of next-generation sequencing in histocompatibility and transplantation
    Lan, James H.
    Zhang, Qiuheng
    CURRENT OPINION IN ORGAN TRANSPLANTATION, 2015, 20 (04) : 461 - 467
  • [46] Clinical value of macrogenome next-generation sequencing on infections
    Han, Benfa
    Zhang, Xiaoli
    Li, Xiuxi
    Chen, Mei
    Ma, Yanlin
    Zhang, Yunxia
    Huo, Song
    OPEN LIFE SCIENCES, 2024, 19 (01)
  • [47] SMITH: a LIMS for handling next-generation sequencing workflows
    Francesco Venco
    Yuriy Vaskin
    Arnaud Ceol
    Heiko Muller
    BMC Bioinformatics, 15
  • [48] Next-generation sequencing and its applications in molecular diagnostics
    Su, Zhenqiang
    Ning, Baitang
    Fang, Hong
    Hong, Huixiao
    Perkins, Roger
    Tong, Weida
    Shi, Leming
    EXPERT REVIEW OF MOLECULAR DIAGNOSTICS, 2011, 11 (03) : 333 - 343
  • [49] Base-calling for next-generation sequencing platforms
    Ledergerber, Christian
    Dessimoz, Christophe
    BRIEFINGS IN BIOINFORMATICS, 2011, 12 (05) : 489 - 497
  • [50] Targeted next-generation sequencing in Slovak cardiomyopathy patients
    Nagyova, E.
    Radvanszky, J.
    Hyblova, M.
    Simovicova, V
    Goncalvesova, E.
    Asselbergs, F. W.
    Kadasi, L.
    Szemes, T.
    Minarik, G.
    BRATISLAVA MEDICAL JOURNAL-BRATISLAVSKE LEKARSKE LISTY, 2019, 120 (01) : 46 - 51