Fast and Parallel Algorithm for Population-Based Segmentation of Copy-Number Profiles

Cited by: 0
Authors
Rigaill, Guillem [1]
Miele, Vincent [2]
Picard, Franck [2]
Affiliations
[1] Univ Evry Val d'Essonne, Unit Rech Genom Vegetale URGV, INRA CNRS, F-91057 Evry, France
[2] Univ Lyon 1, Lab Biometrie & Biol Evolut, UMR CNRS 5558, F-69622 Villeurbanne, France
Source
COMPUTATIONAL INTELLIGENCE METHODS FOR BIOINFORMATICS AND BIOSTATISTICS: 10TH INTERNATIONAL MEETING | 2014 / Vol. 8452
Keywords
DNA copy number; Dynamic Programming; Segmentation; Joint segmentation; Parallel computing; ARRAY CGH DATA; MODEL
DOI
10.1007/978-3-319-09042-9_18
Chinese Library Classification
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Dynamic programming (DP) based change-point methods have shown very good statistical performance in DNA copy-number analysis. However, the quadratic algorithmic complexity of DP has limited their use on high-density arrays or next-generation sequencing data. This complexity issue is particularly critical for the segmentation and calling of segments, and for the joint segmentation of many different profiles. Our contribution is two-fold. First, we provide a DP algorithm for segmentation and calling that is at worst linear, which allows DP-based segmentation to be used on high-density arrays at a considerably reduced computational cost. Second, for joint segmentation, we provide a parallel version of the cghseg package, which can now analyze more than 1,000 profiles of length 100,000 within a few hours. Our method and software package are therefore adapted to the next generation of computers (multi-core) and experiments (very large profiles).
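To make the quadratic cost mentioned in the abstract concrete, here is a minimal sketch of the classic least-squares DP segmentation of a single profile. It is the textbook baseline, not the authors' at-worst-linear algorithm and not the cghseg implementation; the function names `segment_cost` and `dp_segmentation` are hypothetical, and a piecewise-constant mean model with a fixed number of segments is assumed.

```python
import math


def segment_cost(prefix, prefix_sq, i, j):
    """Sum of squared deviations from the mean over signal[i:j] (j exclusive)."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n


def dp_segmentation(signal, n_segments):
    """Return (breakpoints, cost) of the best split into n_segments segments.

    Assumes 1 <= n_segments <= len(signal). The two nested loops over
    (i, j) are the source of the quadratic complexity in the signal length.
    """
    n = len(signal)
    # Prefix sums give each segment cost in constant time.
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for t, x in enumerate(signal):
        prefix[t + 1] = prefix[t] + x
        prefix_sq[t + 1] = prefix_sq[t] + x * x

    # best[k][j]: minimal cost of splitting signal[:j] into k segments.
    best = [[math.inf] * (n + 1) for _ in range(n_segments + 1)]
    arg = [[0] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0
    for k in range(1, n_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = best[k - 1][i] + segment_cost(prefix, prefix_sq, i, j)
                if c < best[k][j]:
                    best[k][j] = c
                    arg[k][j] = i

    # Backtrack from the end to recover the start index of each segment.
    starts = []
    j = n
    for k in range(n_segments, 0, -1):
        j = arg[k][j]
        starts.append(j)
    starts.reverse()
    # The first segment always starts at 0, so it is not a breakpoint.
    return starts[1:], best[n_segments][n]


breakpoints, cost = dp_segmentation([0, 0, 0, 5, 5, 5, 1, 1], 3)
print(breakpoints, cost)
```

Each added segment requires a pass over all (i, j) pairs, so the run time grows as the number of segments times the square of the profile length, which is the bottleneck the abstract refers to for high-density arrays.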
Pages: 248-258
Page count: 11