A shifting level model algorithm that identifies aberrations in array-CGH data

Cited by: 23
Authors
Magi, Alberto [1, 2, 3]
Benelli, Matteo [1, 2]
Marseglia, Giuseppina [1, 2]
Nannetti, Genni [1]
Scordo, Maria Rosaria [4]
Torricelli, Francesca [1, 2]
Affiliations
[1] Univ Florence, Diagnost Genet Unit, Careggi Hosp, AOUC, I-50141 Florence, Italy
[2] Univ Florence, Ctr Study Complex Dynam, I-50019 Florence, Italy
[3] Univ Florence, Dept Med & Surg Crit Care, Florence, Italy
[4] Univ Florence, Infantile Neuropsychiat Unit, Careggi Hosp, AOUC, I-50141 Florence, Italy
Keywords
array-CGH; Segmentation algorithm; Shifting level model; HIDDEN MARKOV-MODELS; DNA COPY NUMBER; SEGMENTATION; TIME
DOI
10.1093/biostatistics/kxp051
Chinese Library Classification (CLC) number
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Array comparative genomic hybridization (aCGH) is a microarray technology that allows one to detect and map genomic alterations. The goal of aCGH analysis is to identify the boundaries of the regions where the number of DNA copies changes (breakpoint identification) and then to label each region as loss, neutral, or gain (calling). In this paper, we introduce a new algorithm, based on the shifting level model (SLM), with the aim of locating regions with different means of the log2 ratio in genomic profiles obtained from aCGH data. We combine the SLM algorithm with the CGHcall calling procedure and compare their performances with 5 state-of-the-art methods. When dealing with synthetic data, our method outperforms the other 5 algorithms in detecting the change in the number of DNA copies in the most challenging situations. For real aCGH data, SLM is able to locate all the cytogenetically mapped aberrations giving a smaller number of false-positive breakpoints than the compared methods. The application of the SLM algorithm is not limited to aCGH data. Our approach can also be used for the analysis of several emerging experimental strategies such as high-resolution tiling arrays.
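The two tasks named in the abstract (breakpoint identification, then calling) can be illustrated with a minimal Python sketch. It is not the SLM algorithm and not the CGHcall procedure; it only simulates a log2-ratio profile whose mean shifts between segments and then labels segments by their mean. The function names, the assumption that breakpoints are already known, and the 0.2 threshold are all hypothetical choices made for illustration.

```python
import random
from statistics import mean


def simulate_profile(segment_means, segment_lengths, noise_sd=0.1, seed=0):
    """Simulate a log2-ratio profile: a piecewise-constant mean plus Gaussian noise.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    rng = random.Random(seed)
    profile = []
    for mu, n in zip(segment_means, segment_lengths):
        profile.extend(rng.gauss(mu, noise_sd) for _ in range(n))
    return profile


def call_segments(profile, breakpoints, threshold=0.2):
    """Label each segment as 'loss', 'neutral', or 'gain' from its mean log2 ratio.

    `breakpoints` are assumed to be known start indices of new segments;
    `threshold` is an arbitrary illustrative cutoff, not a recommended value.
    """
    edges = [0, *breakpoints, len(profile)]
    calls = []
    for start, end in zip(edges, edges[1:]):
        seg_mean = mean(profile[start:end])
        if seg_mean > threshold:
            label = "gain"
        elif seg_mean < -threshold:
            label = "loss"
        else:
            label = "neutral"
        calls.append((start, end, label))
    return calls


# Three segments: neutral, single-copy gain (log2(3/2) is about 0.58), and loss.
profile = simulate_profile([0.0, 0.58, -1.0], [50, 30, 40])
calls = call_segments(profile, breakpoints=[50, 80])
```

In practice the breakpoints are unknown and must be estimated from the data, which is the part a segmentation algorithm such as SLM addresses; the sketch skips that step entirely.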
Pages: 265-280
Page count: 16
Related papers
18 in total
[1]   High-resolution genome-wide mapping of genetic alterations in human glial brain tumors [J].
Bredel, M ;
Bredel, C ;
Juric, D ;
Harsh, GR ;
Vogel, H ;
Recht, LD ;
Sikic, BI .
CANCER RESEARCH, 2005, 65 (10) :4088-4096
[2]   ESTIMATING CURRENT MEAN OF NORMAL-DISTRIBUTION WHICH IS SUBJECTED TO CHANGES IN TIME [J].
CHERNOFF, H ;
ZACKS, S .
ANNALS OF MATHEMATICAL STATISTICS, 1964, 35 (03) :999-&
[3]   VITERBI ALGORITHM [J].
FORNEY, GD .
PROCEEDINGS OF THE IEEE, 1973, 61 (03) :268-278
[4]   Hidden Markov models approach to the analysis of array CGH data [J].
Fridlyand, J ;
Snijders, AM ;
Pinkel, D ;
Albertson, DG ;
Jain, AN .
JOURNAL OF MULTIVARIATE ANALYSIS, 2004, 90 (01) :132-153
[5]   Transcript mapping with high-density oligonucleotide tiling arrays [J].
Huber, Wolfgang ;
Toedling, Joern ;
Steinmetz, Lars M. .
BIOINFORMATICS, 2006, 22 (16) :1963-1970
[6]   Analysis of array CGH data: from signal ratio to gain and loss of DNA regions [J].
Hupé, P ;
Stransky, N ;
Thiery, JP ;
Radvanyi, F ;
Barillot, E .
BIOINFORMATICS, 2004, 20 (18) :3413-3422
[7]   Time series segmentation with shifting means hidden Markov models [J].
Kehagias, Ath. ;
Fortin, V. .
NONLINEAR PROCESSES IN GEOPHYSICS, 2006, 13 (03) :339-352
[8]   Comparative analysis of algorithms for identifying amplifications and deletions in array CGH data [J].
Lai, WR ;
Johnson, MD ;
Kucherlapati, R ;
Park, PJ .
BIOINFORMATICS, 2005, 21 (19) :3763-3770
[9]   BioHMM: a heterogeneous hidden Markov model for segmenting array CGH data [J].
Marioni, JC ;
Thorne, NP ;
Tavaré, S .
BIOINFORMATICS, 2006, 22 (09) :1144-1146
[10]   Circular binary segmentation for the analysis of array-based DNA copy number data [J].
Olshen, AB ;
Venkatraman, ES ;
Lucito, R ;
Wigler, M .
BIOSTATISTICS, 2004, 5 (04) :557-572