Reconstructing DNA copy number by joint segmentation of multiple sequences

Cited by: 11
Authors
Zhang, Zhongyang [1]
Lange, Kenneth [2]
Sabatti, Chiara [3]
Affiliations
[1] Univ Calif Los Angeles, Dept Stat, Los Angeles, CA USA
[2] Univ Calif Los Angeles, Dept Human Genet Biomath & Stat, Los Angeles, CA USA
[3] Stanford Univ, Dept Hlth Res & Policy & Stat, Stanford, CA 94305 USA
Source
BMC BIOINFORMATICS | 2012 / Vol. 13
Keywords
Copy number variant; Copy number polymorphism; Fused lasso; Group fused lasso; MM algorithm; CIRCULAR BINARY SEGMENTATION; HIDDEN MARKOV-MODELS; GENOTYPE CALLS; LASSO; NORMALIZATION; ALGORITHMS; SELECTION; PACKAGE; PATH;
DOI
10.1186/1471-2105-13-205
Chinese Library Classification number
Q5 [Biochemistry];
Discipline classification code
071010; 081704;
Abstract
Background: Variations in DNA copy number carry information on the modalities of genome evolution and mis-regulation of DNA replication in cancer cells. Their study can help localize tumor suppressor genes, distinguish different populations of cancerous cells, and identify genomic variations responsible for disease phenotypes. A number of different high throughput technologies can be used to identify copy number variable sites, and the literature documents multiple effective algorithms. We focus here on the specific problem of detecting regions where variation in copy number is relatively common in the sample at hand. This problem encompasses the cases of copy number polymorphisms, related samples, technical replicates, and cancerous sub-populations from the same individual.
Results: We present a segmentation method named generalized fused lasso (GFL) to reconstruct copy number variant regions. GFL is based on penalized estimation and is capable of processing multiple signals jointly. Our approach is computationally very attractive and leads to sensitivity and specificity levels comparable to those of state-of-the-art specialized methodologies. We illustrate its applicability with simulated and real data sets.
Conclusions: The flexibility of our framework makes it applicable to data obtained with a wide range of technology. Its versatility and speed make GFL particularly useful in the initial screening stages of large data sets.
Pages: 15
Related papers
50 in total
  • [11] DBS: a fast and informative segmentation algorithm for DNA copy number analysis
    Jun Ruan
    Zhen Liu
    Ming Sun
    Yue Wang
    Junqiu Yue
    Guoqiang Yu
    BMC Bioinformatics, 20
  • [12] Simple binary segmentation frameworks for identifying variation in DNA copy number
    Yang, Tae Young
    BMC BIOINFORMATICS, 2012, 13
  • [13] DBS: a fast and informative segmentation algorithm for DNA copy number analysis
    Ruan, Jun
    Liu, Zhen
    Sun, Ming
    Wang, Yue
    Yue, Junqiu
    Yu, Guoqiang
    BMC BIOINFORMATICS, 2019, 20 (1)
  • [14] VARIATION AMONG ALFALFA SOMACLONES IN COPY NUMBER OF REPEATED DNA-SEQUENCES
    KIDWELL, KK
    OSBORN, TC
    GENOME, 1993, 36 (05) : 906 - 912
  • [15] ISOLATION OF LOW-COPY-NUMBER SEQUENCES THAT NEIGHBOR SATELLITE DNA IN MAMMALS
    MARESCA, A
    THAYER, RE
    GUENET, C
    SINGER, MF
    GENE, 1986, 50 (1-3) : 299 - 311
  • [16] Framework for the identification of common variations in multiple DNA copy number samples
    Alqallaf, Abdullah K.
    Tewfik, Ahmed H.
    CONFERENCE RECORD OF THE FORTY-FIRST ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS & COMPUTERS, VOLS 1-5, 2007, : 39 - 43
  • [17] Circular binary segmentation for the analysis of array-based DNA copy number data
    Olshen, AB
    Venkatraman, ES
    Lucito, R
    Wigler, M
    BIOSTATISTICS, 2004, 5 (04) : 557 - 572
  • [18] Sequential Model Selection-Based Segmentation to Detect DNA Copy Number Variation
    Hu, Jianhua
    Zhang, Liwen
    Wang, Huixia Judy
    BIOMETRICS, 2016, 72 (03) : 815 - 826
  • [19] CONDEX: COpy Number Detection in EXome Sequences
    Ramachandran, Arthi
    Micsinai, Mariann
    Pe'er, Itsik
    2011 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE WORKSHOPS, 2011, : 87 - 93
  • [20] CONDEX: COpy Number Detection in EXome sequences
    Ramachandran, Arthi
    Micsinai, Mariann
    Pe'er, Itsik
    2011 IEEE International Conference on Bioinformatics and Biomedicine Workshops, BIBMW 2011, 2011, : 87 - 93