Hamiltonian Monte Carlo with strict convergence criteria reduces run-to-run variability in forensic DNA mixture deconvolution

Cited by: 5
Authors
Susik, Mateusz [1, 2]
Schoenborn, Holger [3]
Sbalzarini, Ivo F. [2, 4, 5]
Affiliations
[1] Biotype GmbH, D-01109 Dresden, Germany
[2] Tech Univ Dresden, Fac Comp Sci, D-01187 Dresden, Germany
[3] qualitype GmbH, D-01109 Dresden, Germany
[4] Max Planck Inst Mol Cell Biol & Genet, D-01307 Dresden, Germany
[5] Ctr Syst Biol Dresden, D-01307 Dresden, Germany
Keywords
Probabilistic genotyping; Hamiltonian Monte Carlo; Bayesian inference; Precision; Gelman-Rubin convergence diagnostic; US POPULATION-DATA; LIKELIHOOD RATIOS; SOFTWARE; SINGLE; STRMIX(TM); PROFILES; ISSUE; MIX13; LOCI
DOI
10.1016/j.fsigen.2022.102744
Chinese Library Classification number
Q3 [Genetics]
Discipline classification codes
071007; 090102
Abstract
Motivation: Analysing mixed DNA profiles is a common task in forensic genetics. Due to the complexity of the data, such analysis is often performed using Markov Chain Monte Carlo (MCMC)-based genotyping algorithms. These trade off precision against execution time. When default settings (including default chain lengths) are used, changes as large as 10-fold in inferred log-likelihood ratios (LR) are observed when the software is run twice on the same case. So far, this uncertainty has been attributed to the stochasticity of MCMC algorithms. Since LRs translate directly to strength of the evidence in a criminal trial, forensic laboratories desire LRs with small run-to-run variability.
Results: We present the use of a Hamiltonian Monte Carlo (HMC) algorithm that reduces run-to-run variability in forensic DNA mixture deconvolution by around an order of magnitude without increased runtime. We achieve this by enforcing strict convergence criteria. We show that the choice of convergence metric strongly influences precision. We validate our method by reproducing previously published results for benchmark DNA mixtures (MIX05, MIX13, and ProvedIt). We also present a complete software implementation of our algorithm that is able to leverage GPU acceleration for the inference process. In the benchmark mixtures, on consumer-grade hardware, the runtime is less than 7 min for 3 contributors, less than 35 min for 4 contributors, and less than an hour for 5 contributors with one known contributor.
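The abstract names the Gelman-Rubin convergence diagnostic in its keywords but does not give the variant or the threshold the authors enforce. As a rough illustration only, the sketch below computes the classic potential scale reduction factor (R-hat) for one scalar quantity sampled by several chains. The function names `gelman_rubin` and `has_converged` and the threshold of 1.05 are assumptions made for this example and are not taken from the paper or its software.

```python
import random
from statistics import mean, variance


def gelman_rubin(chains):
    """Classic Gelman-Rubin R-hat for one scalar quantity.

    chains: list of m equally long lists, each with n post-warm-up draws.
    """
    m = len(chains)
    n = len(chains[0]) if m else 0
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")

    chain_means = [mean(c) for c in chains]
    # W: average of the within-chain sample variances.
    w = mean(variance(c) for c in chains)
    # B: n times the sample variance of the chain means.
    b = n * variance(chain_means)
    if w == 0:
        # Degenerate case: every chain is constant.
        return 1.0 if b == 0 else float("inf")
    # Pooled estimate of the posterior variance.
    var_plus = (n - 1) / n * w + b / n
    return (var_plus / w) ** 0.5


def has_converged(chains, threshold=1.05):
    """Hypothetical stopping rule; the threshold is an assumed value."""
    return gelman_rubin(chains) < threshold


rng = random.Random(0)
# Four chains drawn from the same distribution: R-hat should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
# Four chains stuck around different means: R-hat should be well above 1.
stuck = [[rng.gauss(3.0 * k, 1.0) for _ in range(2000)] for k in range(4)]

print("R-hat, well-mixed chains:", gelman_rubin(mixed))
print("R-hat, disagreeing chains:", gelman_rubin(stuck))
```

A stricter threshold makes the sampler run longer before it is declared converged. That is the kind of trade-off between precision and execution time the abstract describes, although the criteria actually used by the authors may differ from this sketch.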
Pages: 9