A pangenome reference of 36 Chinese populations

Cited by: 72
Authors
Gao, Yang [1, 2, 3, 4]
Yang, Xiaofei [5, 6, 7]
Chen, Hao [3 ]
Tan, Xinjiang [3 ]
Yang, Zhaoqing [8 ]
Deng, Lian [1 ]
Wang, Baonan [2 ]
Kong, Shuang [2 ]
Li, Songyang [2 ]
Cui, Yuhang [2 ]
Lei, Chang [1 ]
Wang, Yimin [3 ]
Pan, Yuwen [3 ]
Ma, Sen [3 ]
Sun, Hao [8 ]
Zhao, Xiaohan [2 ]
Shi, Yingbing [1 ]
Yang, Ziyi [1 ]
Wu, Dongdong [9 ]
Wu, Shaoyuan [10 ]
Zhao, Xingming [11 ]
Shi, Binyin [12 ]
Jin, Li [1, 2]
Hu, Zhibin [13, 14]
Lu, Yan [1 ]
Chu, Jiayou [8 ]
Ye, Kai [6, 15, 16]
Xu, Shuhua [1, 2, 4, 10, 17, 18, 19]
Affiliations
[1] Fudan Univ, Sch Life Sci, Zhangjiang Fudan Int Innovat Ctr, Ctr Evolutionary Biol,Human Phenome Inst,State Ke, Shanghai, Peoples R China
[2] Fudan Univ, Collaborat Innovat Ctr Genet & Dev, Minist Educ, Key Lab Contemporary Anthropo, Shanghai, Peoples R China
[3] Univ Chinese Acad Sci, Chinese Acad Sci, Shanghai Inst Nutr & Hlth, Key Lab Computat Biol, Shanghai, Peoples R China
[4] ShanghaiTech Univ, Sch Life Sci & Technol, Shanghai, Peoples R China
[5] Xi An Jiao Tong Univ, Fac Elect & Informat Engn, Sch Comp Sci & Technol, Xian, Peoples R China
[6] Xi An Jiao Tong Univ, Fac Elect & Informat Engn, MOE Key Lab Intelligent Networks & Networks Secur, Xian, Peoples R China
[7] Xi An Jiao Tong Univ, Genome Inst, Affiliated Hosp 1, Xian, Peoples R China
[8] Chinese Acad Med Sci, Inst Med Biol, Dept Med Genet, Kunming, Yunnan, Peoples R China
[9] Chinese Acad Sci, Kunming Inst Zool, State Key Lab Genet Resources & Evolut, Kunming, Peoples R China
[10] Jiangsu Normal Univ, Int Joint Ctr Genom Jiangsu Prov Sch Life Sci, Jiangsu Key Lab Phylogen & Comparat Genom, Xuzhou, Peoples R China
[11] Fudan Univ, Inst Sci & Technol Brain Inspired Intelligence, Minist Educ, Key MOE Lab Computat Neurosci & Brain Inspired In, Shanghai, Peoples R China
[12] Xi An Jiao Tong Univ, Affiliated Hosp 1, Dept Endocrinol, Xian, Peoples R China
[13] Nanjing Med Univ, State Key Lab Reprod Med, Nanjing, Peoples R China
[14] Nanjing Med Univ, Collaborat Innovat Ctr Canc Personalized Med, Ctr Global Hlth, Sch Publ Hlth,Jiangsu Key Lab Canc Biomarkers, Nanjing, Peoples R China
[15] Xi An Jiao Tong Univ, Fac Elect & Informat Engn, Sch Automat Sci & Engn, Xian, Peoples R China
[16] Xi An Jiao Tong Univ, Sch Life Sci & Technol, Xian, Peoples R China
[17] Fudan Univ, Dept Liver Surg, Shanghai, Peoples R China
[18] Fudan Univ, Zhongshan Hosp, Transplantat Liver Canc Inst, Shanghai, Peoples R China
[19] Chinese Acad Sci, Ctr Excellence Anim Evolut & Genet, Kunming, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
HUMAN GENETIC DIVERSITY; CFC1; MUTATIONS; GENOME; REVEALS; ASSOCIATION; PRINCIPLES; SEQUENCE;
DOI
10.1038/s41586-023-06173-7
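Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract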
Human genomics is witnessing an ongoing paradigm shift from a single reference sequence to a pangenome form, but populations of Asian ancestry are underrepresented. Here we present data from the first phase of the Chinese Pangenome Consortium, including a collection of 116 high-quality and haplotype-phased de novo assemblies based on 58 core samples representing 36 minority Chinese ethnic groups. With an average 30.65x high-fidelity long-read sequence coverage, an average contiguity N50 of more than 35.63 megabases and an average total size of 3.01 gigabases, the CPC core assemblies add 189 million base pairs of euchromatic polymorphic sequences and 1,367 protein-coding gene duplications to GRCh38. We identified 15.9 million small variants and 78,072 structural variants, of which 5.9 million small variants and 34,223 structural variants were not reported in a recently released pangenome reference(1). The Chinese Pangenome Consortium data demonstrate a remarkable increase in the discovery of novel and missing sequences when individuals are included from underrepresented minority ethnic groups. The missing reference sequences were enriched with archaic-derived alleles and genes that confer essential functions related to keratinization, response to ultraviolet radiation, DNA repair, immunological responses and lifespan, implying great potential for shedding new light on human evolution and recovering missing heritability in complex disease mapping.
Pages: 112 / 134
Page count: 23