GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing

Cited by: 5
Authors
Valls-Margarit, Jordi [1]
Galvan-Femenia, Ivan [2, 15]
Matias-Sanchez, Daniel [1]
Blay, Natalia [2]
Puiggros, Montserrat [1]
Carreras, Anna [2]
Salvoro, Cecilia [1]
Cortes, Beatriz [2]
Amela, Ramon [1]
Farre, Xavier [2]
Lerga-Jaso, Jon [3]
Puig, Marta [3]
Sanchez-Herrero, Jose Francisco [4]
Moreno, Victor [5, 6, 7, 8]
Perucho, Manuel [9, 10]
Sumoy, Lauro [4]
Armengol, Lluis [11]
Delaneau, Olivier [12, 13]
Caceres, Mario [3, 14]
de Cid, Rafael [2]
Torrents, David [1, 14]
Affiliations
[1] Barcelona Supercomp Ctr BSC, Life Sci Dept, Barcelona 08034, Spain
[2] Inst Hlth Sci Res Germans Trias & Pujol IGTP, Genomes Life GCAT Lab Grp, Badalona 08916, Spain
[3] Univ Autonoma Barcelona, Inst Biotecnol & Biomed, Barcelona 08193, Spain
[4] Inst Hlth Sci Res Germans Trias & Pujol IGTP, High Content Genom & Bioinformat Unit, Badalona 08916, Spain
[5] Catalan Inst Oncol, Lhospitalet De Llobregat 08908, Spain
[6] Bellvitge Biomed Res Inst IDIBELL, Lhospitalet De Llobregat 08908, Spain
[7] CIBER Epidemiol & Salud Publ CIBERESP, Madrid 28029, Spain
[8] Univ Barcelona UB, Barcelona 08007, Spain
[9] Sanford Burnham Prebys Med Discovery Inst SBP, La Jolla, CA 92037 USA
[10] Hlth Sci Res Inst Germans Trias & Pujol IGTP, Program Predict & Personalized Med Canc PMPPC, Canc Genet & Epigenet, Badalona 08916, Spain
[11] Quantitat Genom Med Labs qGen, Esplugues Del Llobregat 08950, Spain
[12] Univ Lausanne, Dept Computat Biol, CH-1015 Lausanne, Switzerland
[13] Univ Lausanne, Swiss Inst Bioinformat SIB, Quartier Sorge Batiment Amphipole, CH-1015 Lausanne, Switzerland
[14] ICREA, Barcelona 08010, Spain
[15] Barcelona Inst Sci & Technol, Inst Res Biomed IRB Barcelona, Barcelona 08028, Spain
Funding
European Union Horizon 2020;
Keywords
DROSOPHILA-MELANOGASTER; GENOTYPE; CANCER; DISCOVERY; PROGRAM;
DOI
10.1093/nar/gkac076
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
The combined analysis of haplotype panels with phenotype clinical cohorts is a common approach to explore the genetic architecture of human diseases. However, genetic studies are mainly based on single nucleotide variants (SNVs) and small insertions and deletions (indels). Here, we help fill this gap by generating a dense haplotype map focused on the identification, characterization, and phasing of structural variants (SVs). By integrating multiple variant identification methods and Logistic Regression Models (LRMs), we present a catalogue of 35 431 441 variants, including 89 178 SVs (≥50 bp), 30 325 064 SNVs and 5 017 199 indels, across 785 Illumina high-coverage (30x) whole genomes from the Iberian GCAT Cohort, containing a median of 3.52M SNVs, 606 336 indels and 6393 SVs per individual. The haplotype panel is able to impute up to 14 360 728 SNVs/indels and 23 179 SVs, showing a 2.7-fold increase for SVs compared with available genetic variation panels. The value of this panel for SV analysis is shown through an imputed rare Alu element located in a new locus associated with Mononeuritis of lower limb, a rare neuromuscular disease. This study represents the first deep characterization of genetic variation within the Iberian population and the first operational haplotype panel to systematically include SVs in genome-wide genetic studies.
Pages: 2464-2479
Page count: 16