A machine learning approach for accelerating DNA sequence analysis

Cited by: 4
Authors
Memeti, Suejb [1]
Pllana, Sabri [1]
Affiliation
[1] Linnaeus Univ, Dept Comp Sci, S-35195 Vaxjo, Sweden
Keywords
DNA sequence analysis; machine learning; heterogeneous parallel computing
DOI
10.1177/1094342016654214
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
DNA sequence analysis is a data-intensive and computationally intensive problem and therefore demands suitable parallel computing resources and algorithms. In this paper, we describe an optimized approach for DNA sequence analysis on a heterogeneous platform that is accelerated with the Intel Xeon Phi. Such platforms commonly comprise one or two general-purpose host central processing units (CPUs) and one or more Xeon Phi devices. We present a parallel algorithm that shares the work of DNA sequence analysis between the host CPUs and the Xeon Phi device to reduce the overall analysis time. For automatic worksharing we use a supervised machine learning approach, which predicts the performance of DNA sequence analysis on the host and device and accordingly maps fractions of the DNA sequence to the host and device. We evaluate our approach empirically using real-world DNA segments from humans and various animals on a heterogeneous platform that comprises two 12-core Intel Xeon E5 CPUs and an Intel Xeon Phi 7120P device with 61 cores.
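The worksharing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a performance model has already produced a predicted throughput (bases per second) for the host and for the device, and it simply splits the sequence so that both sides are expected to finish at about the same time. The names `split_fraction`, `partition`, and the `overlap` parameter are hypothetical and introduced only for illustration.

```python
def split_fraction(host_rate: float, device_rate: float) -> float:
    """Return the fraction of the sequence to assign to the host.

    Assumption: host_rate and device_rate are predicted throughputs
    (e.g. bases per second) from some performance model. Balancing the
    expected finish times gives host_rate / (host_rate + device_rate).
    """
    if host_rate <= 0 or device_rate <= 0:
        raise ValueError("predicted throughputs must be positive")
    return host_rate / (host_rate + device_rate)


def partition(sequence: str, host_rate: float, device_rate: float,
              overlap: int = 0) -> tuple[str, str]:
    """Split a DNA sequence into a host part and a device part.

    overlap is a hypothetical parameter: the device part starts this
    many characters before the cut, so that a pattern spanning the
    boundary is still seen whole by one side (for a pattern of length
    m, an overlap of m - 1 would be enough).
    """
    cut = round(len(sequence) * split_fraction(host_rate, device_rate))
    host_part = sequence[:cut]
    device_part = sequence[max(0, cut - overlap):]
    return host_part, device_part


if __name__ == "__main__":
    # Hypothetical predictions: the host is three times faster than the device.
    host_part, device_part = partition("ACGTACGT", 3.0, 1.0)
    print(host_part, device_part)
```

In this sketch the predicted throughputs are plain inputs; in a setting like the one the abstract describes, they would come from the supervised model's performance predictions for the host and the device.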
Pages: 363-379
Number of pages: 17