A robust framework for detecting structural variations in a genome

Cited by: 42
Authors
Lee, Seunghak [1]
Cheran, Elango [1]
Brudno, Michael [1, 2, 3]
Affiliations
[1] Univ Toronto, Dept Comp Sci, Toronto, ON M5S 3G4, Canada
[2] Univ Toronto, Banting & Best Dept Med Res, Toronto, ON M5S 3G4, Canada
[3] Univ Toronto, Ctr Anal Genome Evolut & Funct, Toronto, ON M5S 3G4, Canada
DOI
10.1093/bioinformatics/btn176
Chinese Library Classification number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
Motivation: Recently, structural genomic variants have come to the forefront as a significant source of variation in the human population, but the identification of these variants in a large genome remains a challenge. The complete sequencing of a human individual is prohibitive at current costs, while current polymorphism detection technologies, such as SNP arrays, are not able to identify many of the large-scale events. One of the most promising methods to detect such variants is the computational mapping of clone-end sequences to a reference genome.
Results: Here, we present a probabilistic framework for the identification of structural variants using clone-end sequencing. Unlike previous methods, our approach does not rely on an a priori determined mapping of all reads to the reference. Instead, we build a framework for finding the most probable assignment of sequenced clones to potential structural variants based on the other clones. We compare our predictions with the structural variants identified in three previous studies. While there is a statistically significant correlation between the predictions, we also find a significant number of previously uncharacterized structural variants. Furthermore, we identify a number of putative cross-chromosomal events, primarily located proximally to the centromeres of the chromosomes.
Pages: I59-I67
Page count: 9