HMMploidy: inference of ploidy levels from short-read sequencing data

Cited by: 3
Authors
Soraggi, Samuele [1, 2]
Rhodes, Johanna [3]
Altinkaya, Isin [2, 4, 5]
Tarrant, Oliver [2]
Balloux, Francois [6]
Fisher, Matthew C. [3]
Fumagalli, Matteo [2, 7]
Affiliations
[1] Univ Aarhus, Bioinformat Res Ctr BiRC, DK-8000 Aarhus, Denmark
[2] Imperial Coll London, Dept Life Sci, Silwood Pk, Ascot SL5 7PY, England
[3] Imperial Coll London, MRC Ctr Global Infect Dis Anal, Dept Infect Dis Epidemiol, London W2 1PG, England
[4] Hacettepe Univ, Dept Biol, Beytepe Campus, TR-06800 Ankara, Turkiye
[5] GLOBE, Sect Geogenet, Oster Voldgade 5-7, DK-1350 Copenhagen, Denmark
[6] Univ Coll London, UCL Genet Inst, London WC1E 6BT, England
[7] Queen Mary Univ London, Sch Biol & Behav Sci, London E1 4NS, England
Source
PEER COMMUNITY JOURNAL | 2022 / Volume 2
Funding
UK Medical Research Council; Wellcome Trust (UK);
Keywords
CRYPTOCOCCAL MENINGITIS; POPULATION; POLYPLOIDY; GENOMICS; FLUCONAZOLE;
DOI
10.24072/pcjournal.178
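Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Subject classification codes
07; 0710; 09;
Abstract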
The inference of ploidy levels from genomic data is important for understanding the molecular mechanisms underpinning genome evolution. However, current methods based on allele frequency and sequencing depth variation lack the power to infer ploidy levels from low- and mid-depth sequencing data, as they do not account for data uncertainty. Here we introduce HMMploidy, a novel tool that leverages information from multiple samples and combines sequencing depth with genotype likelihoods. We demonstrate that HMMploidy outperforms existing methods in most tested scenarios, especially at low depth with large sample sizes. We apply HMMploidy to sequencing data from the pathogenic fungus Cryptococcus neoformans and retrieve pervasive patterns of aneuploidy, even when artificially downsampling the sequencing data. We envisage that HMMploidy will have wide applicability to low-depth sequencing data from polyploid and aneuploid species.
Pages: 13