Copynumber: Efficient algorithms for single- and multi-track copy number segmentation

Cited by: 207
Authors
Nilsen, Gro [1, 2]
Liestol, Knut [1, 2]
Van Loo, Peter [3, 4, 5]
Vollan, Hans Kristian Moen [6, 7, 8]
Eide, Marianne B. [2, 9]
Rueda, Oscar M. [10, 11]
Chin, Suet-Feung [10, 11]
Russell, Roslin [10, 11]
Baumbusch, Lars O. [6]
Caldas, Carlos [10, 11, 12, 13]
Borresen-Dale, Anne-Lise [6, 7]
Lingjaerde, Ole Christian [1, 2, 6]
Affiliations
[1] Univ Oslo, Dept Informat, N-0316 Oslo, Norway
[2] Univ Oslo, Ctr Canc Biomed, Oslo, Norway
[3] Wellcome Trust Sanger Inst, Canc Genome Project, Cambridge, England
[4] VIB, Dept Human Genet, Louvain, Belgium
[5] Univ Louvain, Louvain, Belgium
[6] Norwegian Radium Hosp, Inst Canc Res, Oslo Univ Hosp, Dept Genet, Oslo, Norway
[7] Univ Oslo, Fac Med, Inst Clin Med, Oslo, Norway
[8] Norwegian Radium Hosp, Oslo Univ Hosp, Div Canc Surg & Transplantat, Dept Oncol, Oslo, Norway
[9] Norwegian Radium Hosp, Inst Canc Res, Oslo Univ Hosp, Dept Immunol, Oslo, Norway
[10] Univ Cambridge, Li Ka Shing Ctr, Canc Res UK Cambridge Res Inst, Cambridge, England
[11] Univ Cambridge, Dept Oncol, Cambridge, England
[12] Addenbrookes Hosp, Cambridge Breast Unit, Cambridge, England
[13] Cambridge Univ Hosp NHS Fdn Trust, Cambridge Natl Inst Hlth Res, Biomed Res Ctr, Cambridge, England
Keywords
Copy number; aCGH; Segmentation; Allele-specific segmentation; Penalized regression; Least squares; Bioconductor; ARRAY CGH DATA; CIRCULAR BINARY SEGMENTATION; IDENTIFICATION; EVOLUTION; CANCER;
DOI
10.1186/1471-2164-13-591
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Subject classification code
071005; 0836; 090102; 100705
Abstract
Background: Cancer progression is associated with genomic instability and an accumulation of gains and losses of DNA. The growing variety of tools for measuring genomic copy numbers, including various types of array-CGH, SNP arrays and high-throughput sequencing, calls for a coherent framework offering unified and consistent handling of single- and multi-track segmentation problems. In addition, there is a demand for highly computationally efficient segmentation algorithms, due to the emergence of very high density scans of copy number.
Results: A comprehensive Bioconductor package for copy number analysis is presented. The package offers a unified framework for single sample, multi-sample and multi-track segmentation and is based on statistically sound penalized least squares principles. Conditional on the number of breakpoints, the estimates are optimal in the least squares sense. A novel and computationally highly efficient algorithm is proposed that utilizes vector-based operations in R. Three case studies are presented.
Conclusions: The R package copynumber is a software suite for segmentation of single- and multi-track copy number data using algorithms based on coherent least squares principles.
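The penalized least squares principle named in the abstract can be illustrated with a minimal sketch. It fits a piecewise constant curve to a single track by minimizing the residual sum of squares plus a fixed penalty per segment. This is a plain quadratic-time dynamic program written in Python for illustration only; it is not the vectorized R algorithm the paper proposes, and the function name `segment_penalized_ls` and the penalty parameter `gamma` are hypothetical names chosen here.

```python
from typing import List, Tuple


def segment_penalized_ls(y: List[float], gamma: float) -> Tuple[List[int], List[float]]:
    """Piecewise constant fit minimizing sum((y - fit)^2) + gamma * (number of segments).

    Returns the start index of each segment and the fitted value at every position.
    Illustrative O(n^2) dynamic program; names and interface are assumptions.
    """
    n = len(y)
    if n == 0:
        return [], []

    # Prefix sums of y and y^2 give the squared error of any segment in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(y):
        s1[i + 1] = s1[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(i: int, j: int) -> float:
        # Residual sum of squares of y[i:j] around its own mean (requires j > i).
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # best[j] is the minimal penalized cost of segmenting y[:j];
    # prev[j] is the start of the last segment in that optimal solution.
    best = [0.0] + [float("inf")] * n
    prev = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            c = best[i] + cost(i, j) + gamma
            if c < best[j]:
                best[j] = c
                prev[j] = i

    # Backtrack from the end to recover the segment start positions.
    starts: List[int] = []
    j = n
    while j > 0:
        starts.append(prev[j])
        j = prev[j]
    starts.reverse()

    # Each segment is fitted by its mean, the least squares estimate.
    fit: List[float] = []
    bounds = starts + [n]
    for a, b in zip(bounds, bounds[1:]):
        fit.extend([(s1[b] - s1[a]) / (b - a)] * (b - a))
    return starts, fit
```

For a given set of breakpoints the segment means are the least squares optimum, which matches the abstract's statement that the estimates are optimal conditional on the number of breakpoints. A larger `gamma` yields fewer segments.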
Pages: 16