Utility of long-read sequencing for All of Us

Cited by: 16
Authors
Mahmoud, M. [1, 2]
Huang, Y. [3]
Garimella, K. [3]
Audano, P. A. [4]
Wan, W. [3]
Prasad, N. [5]
Handsaker, R. E. [6, 7]
Hall, S. [5]
Pionzio, A. [5]
Schatz, M. C. [8]
Talkowski, M. E. [7, 9]
Eichler, E. E. [10, 11]
Levy, S. E. [12]
Sedlazeck, F. J. [1, 2, 13]
Affiliations
[1] Baylor Coll Med, Human Genome Sequencing Ctr, Houston, TX 77030 USA
[2] Baylor Coll Med, Dept Mol & Human Genet, Houston, TX 77030 USA
[3] Broad Inst MIT & Harvard, Data Sci Platform, Cambridge, MA 02141 USA
[4] Jackson Lab Genom Med, Farmington, CT 06032 USA
[5] Discovery Life Sci, Huntsville, AL 35806 USA
[6] Harvard Med Sch, Dept Genet, Boston, MA USA
[7] Broad Inst MIT & Harvard, Program Med & Populat Genet, Cambridge, MA 02141 USA
[8] Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD USA
[9] Massachusetts Gen Hosp, Ctr Genom Med, Boston, MA USA
[10] Univ Washington, Sch Med, Dept Genome Sci, Seattle, WA USA
[11] Univ Washington, Howard Hughes Med Inst, Seattle, WA USA
[12] HudsonAlpha Inst Biotechnol, Huntsville, AL 35806 USA
[13] Rice Univ, Dept Comp Sci, Houston, TX 77005 USA
Funding
US National Institutes of Health
Keywords
MISSING HERITABILITY; DISEASES; GENOME
DOI
10.1038/s41467-024-44804-3
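Chinese Library Classification codes
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [Natural sciences, general]
Discipline classification codes
07; 0710; 09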
Abstract
The All of Us (AoU) initiative aims to sequence the genomes of over one million Americans from diverse ethnic backgrounds to improve personalized medical care. In a recent technical pilot, we compare the performance of traditional short-read sequencing with long-read sequencing in a small cohort of samples from the HapMap project and two AoU control samples representing eight datasets. Our analysis reveals substantial differences in the ability of these technologies to accurately sequence complex medically relevant genes, particularly in terms of gene coverage and pathogenic variant identification. We also consider the advantages and challenges of using low-coverage sequencing to increase sample numbers in large cohort analysis. Our results show that HiFi reads produce the most accurate results for both small and large variants. Further, we present a cloud-based pipeline to optimize SNV, indel and SV calling at scale for long-read analysis. These results lead to widespread improvements across AoU. Using All of Us pilot data, the authors compared short- and long-read performance across medically relevant genes and showcased the utility of long reads to improve variant detection and phasing in both easy- and hard-to-resolve medically relevant genes.
Pages: 13