Cotton pan-genome retrieves the lost sequences and genes during domestication and selection

Cited by: 106
Authors
Li, Jianying [1 ]
Yuan, Daojun [2 ]
Wang, Pengcheng [1 ]
Wang, Qiongqiong [1 ]
Sun, Mengling [1 ]
Liu, Zhenping [1 ]
Si, Huan [1 ]
Xu, Zhongping [1 ]
Ma, Yizan [1 ]
Zhang, Boyang [1 ]
Pei, Liuling [1 ]
Tu, Lili [1 ]
Zhu, Longfu [1 ]
Chen, Ling-Ling [3 ]
Lindsey, Keith [4 ]
Zhang, Xianlong [1 ]
Jin, Shuangxia [1 ]
Wang, Maojun [1 ]
Affiliations
[1] Huazhong Agr Univ, Natl Key Lab Crop Genet Improvement, Wuhan, Peoples R China
[2] Huazhong Agr Univ, Coll Plant Sci & Technol, Wuhan, Peoples R China
[3] Huazhong Agr Univ, Coll Informat, Hubei Key Lab Agr Bioinformat, Wuhan, Peoples R China
[4] Univ Durham, Dept Biosci, Durham, England
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
Cotton; Domestication; Improvement; Pan-genome; Copy number variation (CNV); Presence/absence variation (PAV); Gene loss; POPULATION-STRUCTURE; FIBER QUALITY; ASSOCIATION; DIVERSITY; INSIGHTS; REVEAL; RICE; WILD; DIFFERENTIATION; DIVERGENCE;
DOI
10.1186/s13059-021-02351-w
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Discipline classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
Background: Millennia of directional human selection have reshaped the genomic architecture of cultivated cotton relative to its wild counterparts, but we have limited understanding of the selective retention and fractionation of genomic components. Results: We construct a comprehensive genomic variome based on 1961 cottons and identify 456 Mb and 357 Mb of sequence with domestication and improvement selection signals and 162 loci, 84 of which are novel, including 47 loci associated with 16 agronomic traits. Using pan-genome analyses, we identify 32,569 and 8851 non-reference genes lost from the Gossypium hirsutum and Gossypium barbadense reference genomes, respectively, of which 38.2% (39,278) and 14.2% (11,359) of genes exhibit presence/absence variation (PAV). We document the landscape of PAV selection accompanied by asymmetric gene gain and loss and identify 124 PAVs linked to favorable fiber quality and yield loci. Conclusions: This variation repertoire points to genomic divergence during cotton domestication and improvement, which informs the characterization of favorable gene alleles for improved breeding practice using a pan-genome-based approach.
Pages: 26
Related papers
93 entries in total
[21]   Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure [J].
Gordon, Sean P. ;
Contreras-Moreira, Bruno ;
Woods, Daniel P. ;
Marais, David L. Des ;
Burgess, Diane ;
Shu, Shengqiang ;
Stritt, Christoph ;
Roulin, Anne C. ;
Schackwitz, Wendy ;
Tyler, Ludmila ;
Martin, Joel ;
Lipzen, Anna ;
Dochy, Niklas ;
Phillips, Jeremy ;
Barry, Kerrie ;
Geuten, Koen ;
Budak, Hikmet ;
Juenger, Thomas E. ;
Amasino, Richard ;
Caicedo, Ana L. ;
Goodstein, David ;
Davidson, Patrick ;
Mur, Luis A. J. ;
Figueroa, Melania ;
Freeling, Michael ;
Catalan, Pilar ;
Vogel, John P. .
NATURE COMMUNICATIONS, 2017, 8
[22]   Genetic Diversity of the Two Commercial Tetraploid Cotton Species in the Gossypium Diversity Reference Set [J].
Hinze, Lori L. ;
Gazave, Elodie ;
Gore, Michael A. ;
Fang, David D. ;
Scheffler, Brian E. ;
Yu, John Z. ;
Jones, Don C. ;
Frelichowski, James ;
Percy, Richard G. .
JOURNAL OF HEREDITY, 2016, 107 (03) :274-286
[23]   MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects [J].
Holt, Carson ;
Yandell, Mark .
BMC BIOINFORMATICS, 2011, 12
[24]   Gossypium barbadense and Gossypium hirsutum genomes provide insights into the origin and evolution of allotetraploid cotton [J].
Hu, Yan ;
Chen, Jiedan ;
Fang, Lei ;
Zhang, Zhiyuan ;
Ma, Wei ;
Niu, Yongchao ;
Ju, Longzhen ;
Deng, Jieqiong ;
Zhao, Ting ;
Lian, Jinmin ;
Baruch, Kobi ;
Fang, David ;
Liu, Xia ;
Ruan, Yong-ling ;
Rahman, Mehboob-ur ;
Han, Jinlei ;
Wang, Kai ;
Wang, Qiong ;
Wu, Huaitong ;
Mei, Gaofu ;
Zang, Yihao ;
Han, Zegang ;
Xu, Chenyu ;
Shen, Weijuan ;
Yang, Duofeng ;
Si, Zhanfeng ;
Dai, Fan ;
Zou, Liangfeng ;
Huang, Fei ;
Bai, Yulin ;
Zhang, Yugao ;
Brodt, Avital ;
Ben-Hamo, Hilla ;
Zhu, Xiefei ;
Zhou, Baoliang ;
Guan, Xueying ;
Zhu, Shuijin ;
Chen, Xiaoya ;
Zhang, Tianzhen .
NATURE GENETICS, 2019, 51 (04) :739-+
[25]   Population structure and genetic basis of the agronomic traits of upland cotton in China revealed by a genome-wide association study using high-density SNPs [J].
Huang, Cong ;
Nie, Xinhui ;
Shen, Chao ;
You, Chunyuan ;
Li, Wu ;
Zhao, Wenxia ;
Zhang, Xianlong ;
Lin, Zhongxu .
PLANT BIOTECHNOLOGY JOURNAL, 2017, 15 (11) :1374-1386
[26]   Recent Advances and Future Perspectives in Cotton Research [J].
Huang, Gai ;
Huang, Jin-Quan ;
Chen, Xiao-Ya ;
Zhu, Yu-Xian .
ANNUAL REVIEW OF PLANT BIOLOGY, 2021, 72 :437-462
[27]   Genome sequence of Gossypium herbaceum and genome updates of Gossypium arboreum and Gossypium hirsutum provide insights into cotton A-genome evolution [J].
Huang, Gai ;
Wu, Zhiguo ;
Percy, Richard G. ;
Bai, Mingzhou ;
Li, Yang ;
Frelichowski, James E. ;
Hu, Jiang ;
Wang, Kun ;
Yu, John Z. ;
Zhu, Yuxian .
NATURE GENETICS, 2020, 52 (05) :516-+
[28]   A map of rice genome variation reveals the origin of cultivated rice [J].
Huang, Xuehui ;
Kurata, Nori ;
Wei, Xinghua ;
Wang, Zi-Xuan ;
Wang, Ahong ;
Zhao, Qiang ;
Zhao, Yan ;
Liu, Kunyan ;
Lu, Hengyun ;
Li, Wenjun ;
Guo, Yunli ;
Lu, Yiqi ;
Zhou, Congcong ;
Fan, Danlin ;
Weng, Qijun ;
Zhu, Chuanrang ;
Huang, Tao ;
Zhang, Lei ;
Wang, Yongchun ;
Feng, Lei ;
Furuumi, Hiroyasu ;
Kubo, Takahiko ;
Miyabayashi, Toshie ;
Yuan, Xiaoping ;
Xu, Qun ;
Dong, Guojun ;
Zhan, Qilin ;
Li, Canyang ;
Fujiyama, Asao ;
Toyoda, Atsushi ;
Lu, Tingting ;
Feng, Qi ;
Qian, Qian ;
Li, Jiayang ;
Han, Bin .
NATURE, 2012, 490 (7421) :497-+
[29]   Sunflower pan-genome analysis shows that hybridization altered gene content and disease resistance [J].
Hubner, Sariel ;
Bercovich, Natalia ;
Todesco, Marco ;
Mandel, Jennifer R. ;
Odenheimer, Jens ;
Ziegler, Emanuel ;
Lee, Joon S. ;
Baute, Gregory J. ;
Owens, Gregory L. ;
Grassa, Christopher J. ;
Ebert, Daniel P. ;
Ostevik, Katherine L. ;
Moyers, Brook T. ;
Yakimowski, Sarah ;
Masalia, Rishi R. ;
Gao, Lexuan ;
Calic, Irina ;
Bowers, John E. ;
Kane, Nolan C. ;
Swanevelder, Dirk Z. H. ;
Kubach, Timo ;
Munos, Stephane ;
Langlade, Nicolas B. ;
Burke, John M. ;
Rieseberg, Loren H. .
NATURE PLANTS, 2019, 5 (01) :54-62
[30]   CLUMPP: a cluster matching and permutation program for dealing with label switching and multimodality in analysis of population structure [J].
Jakobsson, Mattias ;
Rosenberg, Noah A. .
BIOINFORMATICS, 2007, 23 (14) :1801-1806