Recent segmental and gene duplications in the mouse genome

Cited by: 65
Authors
Cheung, J
Wilson, MD
Zhang, JJ
Khaja, R
MacDonald, JR
Heng, HHQ
Koop, BF
Scherer, SW
Affiliations
[1] Hosp Sick Children, Res Inst, Program Genet & Genom Biol, Toronto, ON M5G 1X8, Canada
[2] Univ Toronto, Dept Mol & Med Genet, Toronto, ON M5G 1X8, Canada
[3] Univ Victoria, Dept Biol, Ctr Biomed Res, Victoria, BC V8W 3N5, Canada
[4] Wayne State Univ, Sch Med, Detroit, MI 48202 USA
Funding
Canadian Institutes of Health Research;
Keywords
DOI
10.1186/gb-2003-4-8-r47
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification code
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: The high-quality draft sequence of the mouse genome and its associated annotations are an invaluable biological resource. Identifying recent duplications in the mouse genome, especially in regions containing genes, may highlight important events in recent murine evolution. In addition, detecting recent sequence duplications can reveal potentially problematic regions of the genome assembly. We use BLAST-based computational heuristics to identify large (greater than or equal to 5 kb) and recent (greater than or equal to 90% sequence identity) segmental duplications in the mouse genome sequence. Here we present a database of recently duplicated regions of the mouse genome found in the Mouse Genome Sequencing Consortium (MGSC) February 2002 and February 2003 assemblies.
Results: We determined that 33.6 Mb of 2,695 Mb (1.2%) of sequence from the February 2003 mouse genome sequence assembly is involved in recent segmental duplications, which is less than that observed in the human genome (around 3.5-5%). Of this duplication content, 8.9 Mb (26%) consisted of 'unmapped' chromosome sequence. Moreover, we suspect that an additional 18.5 Mb of sequence is involved in duplication artifacts arising from sequence misassignment errors in this genome assembly. By searching for genes located within these regions, we identified 675 genes that mapped to duplicated regions of the mouse genome. Sixteen of these genes appear to have been duplicated independently in the human genome. From our dataset we further characterized a 42 kb recent segmental duplication of Mater, a maternal-effect gene essential for embryogenesis in mice.
Conclusion: Our results provide an initial analysis of the recently duplicated sequence and gene content of the mouse genome. Many of these duplicated loci, as well as regions identified as being involved in potential sequence misassignment errors, will require further mapping and sequencing to achieve accuracy. A Genome Browser database was set up to display the duplication content identified in this work. These data will also be relevant to the growing number of investigators who use the draft genome sequence for experimental design and analysis.
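The abstract defines a recent segmental duplication by two thresholds: at least 5 kb in length and at least 90% sequence identity. The sketch below illustrates only that thresholding step, applied to alignments in BLAST's standard tabular layout (query, subject, percent identity, alignment length in the first four columns). It is not the authors' pipeline, which this record does not describe; the names `Hit`, `parse_tabular`, and `filter_candidates` are hypothetical, and a real analysis would also need to exclude self-alignments and merge overlapping hits.

```python
from typing import Iterable, List, NamedTuple

# Thresholds taken from the abstract: >= 5 kb and >= 90% identity.
MIN_LENGTH_BP = 5_000
MIN_IDENTITY_PERCENT = 90.0


class Hit(NamedTuple):
    """One pairwise alignment (hypothetical container for illustration)."""
    query: str
    subject: str
    percent_identity: float
    alignment_length: int


def parse_tabular(lines: Iterable[str]) -> List[Hit]:
    """Parse tab-separated alignment lines, skipping blanks and '#' comments.

    Assumes the first four columns are query id, subject id,
    percent identity, and alignment length.
    """
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        hits.append(Hit(fields[0], fields[1], float(fields[2]), int(fields[3])))
    return hits


def filter_candidates(hits: Iterable[Hit]) -> List[Hit]:
    """Keep alignments that meet both the length and identity thresholds."""
    return [
        h for h in hits
        if h.alignment_length >= MIN_LENGTH_BP
        and h.percent_identity >= MIN_IDENTITY_PERCENT
    ]
```

For example, an alignment of 6,200 bp at 95.2% identity would be kept, while one of 8,000 bp at 89.9% identity or one of 4,999 bp at 99.0% identity would be discarded.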
Pages: 12
Related papers
44 in total
[1]   Recent segmental duplications in the human genome [J].
Bailey, JA ;
Gu, ZP ;
Clark, RA ;
Reinert, K ;
Samonte, RV ;
Schwartz, S ;
Adams, MD ;
Myers, EW ;
Li, PW ;
Eichler, EE .
SCIENCE, 2002, 297 (5583) :1003-1007
[2]   Segmental duplications: Organization and impact within the current Human Genome Project assembly [J].
Bailey, JA ;
Yavor, AM ;
Massa, HF ;
Trask, BJ ;
Eichler, EE .
GENOME RESEARCH, 2001, 11 (06) :1005-1017
[3]   Deficiencies of human complement component C4A and C4B and heterozygosity in length variants of RP-C4-CYP21-TNX (RCCX) modules in Caucasians: The load of RCCX genetic diversity on major histocompatibility complex-associated disease [J].
Blanchong, CA ;
Zhou, B ;
Rupert, KL ;
Chung, EK ;
Jones, KN ;
Sotos, JF ;
Zipf, WB ;
Rennebohm, RM ;
Yu, CY .
JOURNAL OF EXPERIMENTAL MEDICINE, 2000, 191 (12) :2183-2196
[4]   Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence [J].
Cheung, J ;
Estivill, X ;
Khaja, R ;
MacDonald, JR ;
Lau, K ;
Tsui, LC ;
Scherer, SW .
GENOME BIOLOGY, 2003, 4 (04)
[5]   Genetic sophistication of human complement components C4A and C4B and RP-C4-CYP21-TNX (RCCX) modules in the major histocompatibility complex [J].
Chung, EK ;
Yang, Y ;
Rennebohm, RM ;
Lokki, ML ;
Higgins, GC ;
Jones, KN ;
Zhou, B ;
Blanchong, CA ;
Yu, CY .
AMERICAN JOURNAL OF HUMAN GENETICS, 2002, 71 (04) :823-837
[6]   Segmental duplications: An 'expanding' role in genomic instability and disease [J].
Emanuel, BS ;
Shaikh, TH .
NATURE REVIEWS GENETICS, 2001, 2 (10) :791-800
[7]   Chromosomal regions containing high-density and ambiguously mapped putative single nucleotide polymorphisms (SNPs) correlate with segmental duplications in the human genome [J].
Estivill, X ;
Cheung, J ;
Pujana, MA ;
Nakabayashi, K ;
Scherer, SW ;
Tsui, LC .
HUMAN MOLECULAR GENETICS, 2002, 11 (17) :1987-1995
[8]   Gene content and function of the ancestral chromosome fusion site in human chromosome 2q13-2q14.1 and paralogous regions [J].
Fan, YX ;
Newman, T ;
Linardopoulou, E ;
Trask, BJ .
GENOME RESEARCH, 2002, 12 (11) :1663-1672
[9]  
Force A, 1999, GENETICS, V151, P1531
[10]  
GUMUCIO DL, 1985, J BIOL CHEM, V260, P3483