Genome sequence of the cultivated cotton Gossypium arboreum

Cited by: 724
Authors
Li, Fuguang [1]
Fan, Guangyi [2]
Wang, Kunbo [1]
Sun, Fengming [2]
Yuan, Youlu [1]
Song, Guoli [1]
Li, Qin [3]
Ma, Zhiying [4]
Lu, Cairui [1]
Zou, Changsong [1]
Chen, Wenbin [2]
Liang, Xinming [2]
Shang, Haihong [1]
Liu, Weiqing [2]
Shi, Chengcheng [2]
Xiao, Guanghui [3]
Gou, Caiyun [2]
Ye, Wuwei [1]
Xu, Xun [2]
Zhang, Xueyan [1]
Wei, Hengling [1]
Li, Zhifang [1]
Zhang, Guiyin [4]
Wang, Junyi [2]
Liu, Kun [1]
Kohel, Russell J. [5]
Percy, Richard G. [5]
Yu, John Z. [5]
Zhu, Yu-Xian [3]
Wang, Jun [2, 6, 7, 8, 9, 10]
Yu, Shuxun [1]
Affiliations
[1] Chinese Acad Agr Sci, Inst Cotton Res, State Key Lab Cotton Biol, Anyang, Peoples R China
[2] BGI Shenzhen, Shenzhen, Peoples R China
[3] Peking Univ, Coll Life Sci, State Key Lab Prot & Plant Gene Res, Beijing 100871, Peoples R China
[4] Agr Univ Hebei, Key Lab Crop Germplasm Resources Hebei, Baoding, Peoples R China
[5] USDA ARS, Crop Germplasm Res Unit, Southern Plains Agr Res Ctr, College Stn, TX USA
[6] Univ Copenhagen, Dept Biol, Copenhagen, Denmark
[7] King Abdulaziz Univ, Jeddah 21413, Saudi Arabia
[8] Macau Univ Sci & Technol, Macau, Peoples R China
[9] Univ Hong Kong, Dept Med, Hong Kong, Hong Kong, Peoples R China
[10] Univ Hong Kong, State Key Lab Pharmaceut Biotechnol, Hong Kong, Hong Kong, Peoples R China
Funding
US National Science Foundation; National Natural Science Foundation of China
Keywords
CELL ELONGATION; DRAFT GENOME; EVOLUTION; ARABIDOPSIS; FIBER; BIOSYNTHESIS; DYNAMICS; ETHYLENE; SORGHUM; GENUS;
DOI
10.1038/ng.2987
Chinese Library Classification number
Q3 [Genetics]
Discipline classification code
071007; 090102
Abstract
The complex allotetraploid nature of the cotton genome (AADD; 2n = 52) makes genetic, genomic and functional analyses extremely challenging. Here we sequenced and assembled the Gossypium arboreum (AA; 2n = 26) genome, a putative contributor of the A subgenome. A total of 193.6 Gb of clean sequence covering the genome by 112.6-fold was obtained by paired-end sequencing. We further anchored and oriented 90.4% of the assembly on 13 pseudochromosomes and found that 68.5% of the genome is occupied by repetitive DNA sequences. We predicted 41,330 protein-coding genes in G. arboreum. Two whole-genome duplications were shared by G. arboreum and Gossypium raimondii before speciation. Insertions of long terminal repeats in the past 5 million years are responsible for the twofold difference in the sizes of these genomes. Comparative transcriptome studies showed the key role of the nucleotide binding site (NBS)-encoding gene family in resistance to Verticillium dahliae and the involvement of ethylene in the development of cotton fiber cells.
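The sequencing figures in the abstract can be cross-checked with simple arithmetic. The sketch below assumes the usual definition of fold coverage (total sequenced bases divided by genome size); the function name is hypothetical, and the result is only the size implied by the two quoted numbers, not a value stated in this record.

```python
def implied_genome_size_gb(total_bases_gb: float, fold_coverage: float) -> float:
    """Return the genome size (Gb) implied by coverage = total bases / genome size.

    Hypothetical helper for illustration; it simply rearranges the
    coverage definition and is not taken from the paper.
    """
    if fold_coverage <= 0:
        raise ValueError("fold_coverage must be positive")
    return total_bases_gb / fold_coverage


# Figures quoted in the abstract: 193.6 Gb of clean sequence at 112.6-fold coverage.
size_gb = implied_genome_size_gb(193.6, 112.6)
print(f"Implied genome size: {size_gb:.2f} Gb")
```

Under that assumption, the two figures imply a genome of roughly 1.7 Gb.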
Pages: 567-572
Page count: 6