Investigating the Performance of Oxford Nanopore Long-Read Sequencing with Respect to Illumina Microarrays and Short-Read Sequencing

Times cited: 0
Authors
Santos, Renato [1]
Lee, Hyunah [2]
Williams, Alexander [2]
Baffour-Kyei, Anastasia [2]
Lee, Sang-Hyuck [2]
Troakes, Claire [3]
Al-Chalabi, Ammar [3]
Breen, Gerome [2]
Iacoangeli, Alfredo [1, 3, 4, 5]
Affiliations
[1] Kings Coll London, Dept Biostat & Hlth Informat, Inst Psychiat Psychol & Neurosci, 16 De Crespigny Pk, London SE5 8AB, England
[2] Kings Coll London, Social Genet & Dev Psychiat Ctr, Inst Psychiat Psychol & Neurosci, 16 De Crespigny Pk, London SE5 8AB, England
[3] Kings Coll London, Inst Psychiat Psychol & Neurosci, Dept Basic & Clin Neurosci, 5 Cutcombe Rd, London SE5 9RX, England
[4] Ground RR Block QE Med Ctr Ralph & Patricia Sarich, Perron Inst Neurol & Translat Sci, 8 Verdun St, Nedlands, WA 6009, Australia
[5] South London & Maudsley NHS Fdn Trust, NIHR Maudsley Biomed Res Ctr BRC, 16 De Crespigny Pk, London SE5 8AF, England
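Funding
UK Economic and Social Research Council; UK Medical Research Council; EU Horizon 2020; European Research Council; UK Engineering and Physical Sciences Research Council
Keywords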
Oxford Nanopore Technologies; long-read sequencing; short-read sequencing; variant calling; benchmark; genomic variants; low-complexity regions; multiplexing; experimental variables; MEDICAL GENETICS; AMERICAN-COLLEGE; ASSOCIATION; GUIDELINES; STANDARDS; GENOMICS;
DOI
10.3390/ijms26104492
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Oxford Nanopore Technologies (ONT) long-read sequencing (LRS) has emerged as a promising genomic analysis tool, yet comprehensive benchmarks against established platforms across diverse datasets remain limited. This study aimed to benchmark LRS performance against Illumina short-read sequencing (SRS) and microarrays for variant detection across different genomic contexts and to evaluate the impact of experimental factors. We sequenced 14 human genomes using the three platforms and evaluated the detection of single nucleotide variants (SNVs), insertions/deletions (indels), and structural variants (SVs), stratifying by high-complexity, low-complexity, and dark genome regions while assessing the effects of multiplexing, depth, and read length. LRS SNV accuracy was slightly lower than that of SRS in high-complexity regions (F-measure: 0.954 vs. 0.967) but showed comparable sensitivity in low-complexity regions. LRS showed robust performance for small (1-5 bp) indels in high-complexity regions (F-measure: 0.869), but its agreement with SRS decreased significantly in low-complexity regions and for larger indel sizes. Within dark regions, LRS identified more indels than SRS but showed lower base-level accuracy. LRS identified 2.86 times more SVs than SRS and excelled at detecting large variants (>6 kb), with SV detection improving as sequencing depth increased. Sequencing depth strongly influenced variant calling performance, whereas multiplexing effects were minimal. Our findings provide valuable insights for optimising LRS applications in genomic research and diagnostics.
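The abstract reports F-measure values without defining the metric. As a minimal sketch, assuming the balanced F1 score (the harmonic mean of precision and recall) is what is meant, it can be computed from counts of true positive, false positive, and false negative calls. The function name and the counts below are hypothetical illustrations, not values from the study.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Balanced F-measure (F1) from variant-call counts.

    Assumption: the record does not state the exact definition used,
    so this is the standard harmonic mean of precision and recall.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # fraction of calls that are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # fraction of true variants recovered
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only.
score = f_measure(tp=90, fp=10, fn=10)
print(f"F-measure: {score:.3f}")
```

Because F1 penalises false positives and false negatives equally, a platform can keep a similar sensitivity (recall) to another while still scoring a lower F-measure if its precision drops.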
Pages: 31