SiCloneFit: Bayesian inference of population structure, genotype, and phylogeny of tumor clones from single-cell genome sequencing data

Cited by: 68
Authors
Zafar, Hamim [1, 2]
Navin, Nicholas [3]
Chen, Ken [2]
Nakhleh, Luay [1]
Affiliations
[1] Rice Univ, Dept Comp Sci, Houston, TX 77005 USA
[2] Univ Texas MD Anderson Canc Ctr, Dept Bioinformat & Computat Biol, Houston, TX 77030 USA
[3] Univ Texas MD Anderson Canc Ctr, Dept Genet, Houston, TX 77030 USA
Funding
National Science Foundation (USA)
Keywords
INTRATUMOR HETEROGENEITY; CANCER; EVOLUTION; SELECTION; HISTORY; MODEL;
DOI
10.1101/gr.243121.118
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010 ; 081704 ;
Abstract
Accumulation and selection of somatic mutations in a Darwinian framework result in intra-tumor heterogeneity (ITH) that poses significant challenges to the diagnosis and clinical therapy of cancer. Identification of the tumor cell populations (clones) and reconstruction of their evolutionary relationship can elucidate this heterogeneity. Recently developed single-cell DNA sequencing (SCS) technologies promise to resolve ITH to a single-cell level. However, technical errors in SCS data sets, including false-positives (FP) and false-negatives (FN) due to allelic dropout, and cell doublets, significantly complicate these tasks. Here, we propose a nonparametric Bayesian method that reconstructs the clonal populations as clusters of single cells, genotypes of each clone, and the evolutionary relationship between the clones. It employs a tree-structured Chinese restaurant process as the prior on the number and composition of clonal populations. The evolution of the clonal populations is modeled by a clonal phylogeny and a finite-site model of evolution to account for potential mutation recurrence and losses. We probabilistically account for FP and FN errors, and cell doublets are modeled by employing a Beta-binomial distribution. We develop a Gibbs sampling algorithm comprising partial reversible-jump and partial Metropolis-Hastings updates to explore the joint posterior space of all parameters. The performance of our method on synthetic and experimental data sets suggests that joint reconstruction of tumor clones and clonal phylogeny under a finite-site model of evolution leads to more accurate inferences. Our method is the first to enable this joint reconstruction in a fully Bayesian framework, thus providing measures of support of the inferences it makes.
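The probabilistic treatment of FP and FN errors mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simplified binary genotype matrix (1 = mutation present, 0 = absent), a single FP rate and a single FN rate shared across all sites, and no doublets; the function name `log_likelihood` and its parameters are hypothetical and are not taken from the SiCloneFit software. The likelihood actually used by the authors may differ.

```python
import math


def log_likelihood(observed, assignment, clone_genotypes, fp_rate, fn_rate):
    """Log-likelihood of observed binary genotypes given clone genotypes.

    observed:        list of rows (one per cell); entries are 0, 1, or None (missing)
    assignment:      assignment[i] is the clone index of cell i
    clone_genotypes: list of rows (one per clone) with true 0/1 genotypes
    fp_rate:         assumed probability of observing 1 when the truth is 0
    fn_rate:         assumed probability of observing 0 when the truth is 1
    """
    total = 0.0
    for cell, row in enumerate(observed):
        true_row = clone_genotypes[assignment[cell]]
        for obs, true in zip(row, true_row):
            if obs is None:
                # Missing entries carry no information in this sketch.
                continue
            if true == 0:
                p = fp_rate if obs == 1 else 1.0 - fp_rate
            else:
                p = fn_rate if obs == 0 else 1.0 - fn_rate
            total += math.log(p)
    return total


if __name__ == "__main__":
    # Two cells, two sites, two clones (illustrative values only).
    observed = [[1, 0], [0, None]]
    assignment = [0, 1]
    clone_genotypes = [[1, 0], [0, 0]]
    print(log_likelihood(observed, assignment, clone_genotypes, 0.01, 0.2))
```

In a sampler of the kind the abstract describes, a quantity like this would be re-evaluated whenever a cell is moved to a different clone or a clone genotype changes, so that proposals can be compared by their likelihood.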
Pages: 1847-1859
Page count: 13