Analysis of Short-read Aligners using Genome Sequence Complexity

Authors
Quang Tran [1]
Nam Sy Vo [2]
Eric Hicks [1]
Tin Nguyen [3]
Vinhthuy Phan [1]
Affiliations
[1] Univ Memphis, Dept Comp Sci, Memphis, TN 38152 USA
[2] Vingrp Big Data Inst, Dept Computat Biomed, Hanoi, Vietnam
[3] Univ Nevada, Dept Comp Sci & Engn, Reno, NV 89557 USA
Source
2020 12th International Conference on Knowledge and Systems Engineering (IEEE KSE 2020), 2020
Keywords
genome complexity; short-read alignment; genomic analysis
DOI
10.1109/kse50997.2020.9287422
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Next-generation sequencing technologies have the capability to provide large numbers of short reads inexpensively and accurately. Researchers have proposed many different methods to align short reads to reference genomes. Nevertheless, long repeats, which are known to be abundant in eukaryotic genomes, have caused considerable difficulty for genome assembly methods that rely on short-read alignment. Although a few researchers have studied sequence complexity of genomes in terms of repeats, none have quantitatively related such complexity to the difficulty of short-read alignment and assembly. In this paper, we investigate several measures of genome sequence complexity with the goal of quantifying the difficulty of short-read alignment. Using genomic data from 17 different organisms and testing against 12 state-of-the-art short-read aligners, we found a very strong correlation between the performance of virtually all of these aligners and measures of genome sequence complexity. Further, we show how these measures might be used to analyze and predict the performance of aligners, and more importantly, select the best aligners for specific genomes.
Pages: 312-317
Number of pages: 6