A Distributed System for Fast Alignment of Next-Generation Sequencing Data

Cited by: 0
Authors
Srimani, Jaydeep K. [1 ]
Wu, Po-Yen [1 ]
Phan, John H. [2 ,3 ]
Wang, May D. [2 ,3 ]
Affiliations
[1] Georgia Inst Technol, Dept Elect & Comp Engn, Atlanta, GA 30332 USA
[2] Georgia Inst Technol, Wallace H Coulter Biomed Engn Dept, Atlanta, GA 30332 USA
[3] Emory Univ, Atlanta, GA 30332 USA
Funding
National Institutes of Health (USA)
Keywords
BOINC; distributed computing; next-generation sequencing; gene expression analysis; read alignment
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
We developed a scalable distributed computing system using the Berkeley Open Infrastructure for Network Computing (BOINC) to align next-generation sequencing (NGS) data quickly and accurately. NGS technology is emerging as a promising platform for gene expression analysis due to its high sensitivity compared to traditional genomic microarray technology. However, despite the benefits, NGS datasets can be prohibitively large, requiring significant computing resources to obtain sequence alignment results. Moreover, as the data and alignment algorithms become more prevalent, it will become necessary to examine the effect of the multitude of alignment parameters on various NGS systems. We validate the distributed software system by (1) computing simple timing results to show the speed-up gained by using multiple computers, (2) optimizing alignment parameters using simulated NGS data, and (3) computing NGS expression levels for a single biological sample using optimal parameters and comparing these expression levels to those of a microarray sample. Results indicate that the distributed alignment system achieves an approximately linear speed-up and correctly distributes sequence data to and gathers alignment results from multiple compute clients.
Pages: 579-584
Number of pages: 6