A novel statistical method for decontaminating T-cell receptor sequencing data

Cited by: 1
Authors
Li, Ruoxing [1]
Altan, Mehmet [2]
Reuben, Alexandre [2]
Lin, Ruitao [8]
Heymach, John V. [3]
Tran, Hai [2]
Chen, Runzhe [4]
Little, Latasha [5]
Hubert, Shawna [6]
Zhang, Jianjun [7, 9, 10, 11]
Li, Ziyi [8, 12]
Affiliations
[1] Univ Texas Hlth Sci Ctr Houston, Dept Biostat & Data Sci, Houston, TX USA
[2] Univ Texas MD Anderson Canc Ctr, Dept Thorac Head & Neck Med Oncol, Houston, TX USA
[3] Univ Texas MD Anderson Canc Ctr, Chair Thorac Head & Neck Med Oncol, Houston, TX USA
[4] Univ Texas MD Anderson Canc Ctr, Houston, TX USA
[5] Univ Texas MD Anderson Canc Ctr, Dept Thorac Head & Neck Med Oncol, Houston, TX USA
[6] Univ Texas MD Anderson Canc Ctr, Chair Thorac Head & Neck Med Oncol, Houston, TX USA
[7] Univ Texas MD Anderson Canc Ctr, Dept Thorac Head & Neck Med Oncol, Houston, TX USA
[8] Univ Texas MD Anderson Canc Ctr, Dept Biostat, Houston, TX USA
[9] Univ Texas MD Anderson Canc Ctr, Lung Canc Genom Program, Houston, TX 77030 USA
[10] Univ Texas MD Anderson Canc Ctr, Lung Canc Intercept Program, Houston, TX 77030 USA
[11] Univ Texas MD Anderson Canc Ctr, Dept Thorac Head & Neck Med Oncol, Houston, TX 77030 USA
[12] Univ Texas MD Anderson Canc Ctr, Dept Biostat, Houston, TX 77030 USA
Keywords
Bayesian model; Contamination detection; TCR sequencing; TCR repertoire
DOI
10.1093/bib/bbad230
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The T-cell receptor (TCR) repertoire is highly diverse among the population and plays an essential role in initiating multiple immune processes. TCR sequencing (TCR-seq) has been developed to profile the T cell repertoire. Similar to other high-throughput experiments, contamination can happen during several steps of TCR-seq, including sample collection, preparation and sequencing. Such contamination creates artifacts in the data, leading to inaccurate or even biased results. Most existing methods assume 'clean' TCR-seq data as the starting point with no ability to handle data contamination. Here, we develop a novel statistical model to systematically detect and remove contamination in TCR-seq data. We summarize the observed contamination into two sources, pairwise and cross-cohort. For both sources, we provide visualizations and summary statistics to help users assess the severity of the contamination. Incorporating prior information from 14 existing TCR-seq datasets with minimum contamination, we develop a straightforward Bayesian model to statistically identify contaminated samples. We further provide strategies for removing the impacted sequences to allow for downstream analysis, thus avoiding any need to repeat experiments. Our proposed model shows robustness in contamination detection compared with a few off-the-shelf detection methods in simulation studies. We illustrate the use of our proposed method on two TCR-seq datasets generated locally.
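The abstract mentions summary statistics for pairwise contamination but does not define them. As a minimal sketch only, and not the authors' method, one simple way to look at pairwise sequence sharing is the Jaccard index of the clonotype sets of each pair of samples; an unusually high value for one pair relative to the rest of the cohort could prompt a closer look. The function name `pairwise_sharing`, the sample labels and the CDR3 strings below are all hypothetical and chosen for illustration.

```python
from itertools import combinations


def pairwise_sharing(samples):
    """Return {(a, b): Jaccard index of the clonotype sets of samples a and b}.

    `samples` maps a sample label to an iterable of clonotype sequences.
    This is an illustrative overlap statistic, not the Bayesian model
    described in the abstract.
    """
    result = {}
    # Sort the labels so that each pair key is in a predictable order.
    for a, b in combinations(sorted(samples), 2):
        set_a, set_b = set(samples[a]), set(samples[b])
        union = set_a | set_b
        # Two empty samples share nothing; avoid dividing by zero.
        result[(a, b)] = len(set_a & set_b) / len(union) if union else 0.0
    return result


# Toy, made-up CDR3 amino-acid sequences (not real data).
samples = {
    "S1": ["CASSLGQETQYF", "CASSPGTGELFF", "CASSYEQYF"],
    "S2": ["CASSLGQETQYF", "CASSPGTGELFF", "CASRDRGNTIYF"],
    "S3": ["CASSFSTDTQYF"],
}
sharing = pairwise_sharing(samples)
for pair, value in sharing.items():
    print(pair, round(value, 3))
```

In this toy input, S1 and S2 share two of four distinct sequences, while S3 shares none with either. What level of sharing counts as suspicious is not stated in the abstract and would have to come from reference data, such as the low-contamination datasets the authors use as prior information.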
Pages: 8
Related papers
50 in total
  • [21] Peripheral T-Cell Receptor Repertoire Profiling in Non-small Cell Lung Cancer Using an Amplicon-Based Sequencing Assay
    Wan, Z. Y.
    Hee, Y. T.
    Kaur, H.
    Pinweha, P.
    Choudhury, Y.
    Tan, M. -H.
    JOURNAL OF THORACIC ONCOLOGY, 2022, 17 (09) : S569 - S569
  • [22] Treatment response and outcome of children with T-cell acute lymphoblastic leukemia expressing the gamma-delta T-cell receptor
    Pui, Ching-Hon
    Pei, Deqing
    Cheng, Cheng
    Tomchuck, Suzanne L.
    Evans, Scarlett N.
    Inaba, Hiroto
    Jeha, Sima
    Raimondi, Susana C.
    Choi, John K.
    Thomas, Paul G.
    Dallas, Mari Hashitate
    ONCOIMMUNOLOGY, 2019, 8 (08):
  • [23] T-cell receptor sequencing reveals selected donor-reactive CD8+ T cell clones resist antithymocyte globulin depletion after kidney transplantation
    Ningoo, Mehek
    Cruz-Encarnacion, Pamela
    Khilnani, Calla
    Heeger, Peter S.
    Fribourg, Miguel
    AMERICAN JOURNAL OF TRANSPLANTATION, 2024, 24 (05) : 755 - 764
  • [24] High-throughput and single-cell T cell receptor sequencing technologies
    Pai, Joy A.
    Satpathy, Ansuman T.
    NATURE METHODS, 2021, 18 (08) : 881 - 892
  • [25] Extraction and characterization of the rhesus macaque T-cell receptor β-chain genes
    Greenaway, Hui Yee
    Kurniawan, Monica
    Price, David A.
    Douek, Daniel C.
    Davenport, Miles P.
    Venturi, Vanessa
    IMMUNOLOGY AND CELL BIOLOGY, 2009, 87 (07) : 546 - 553
  • [26] TCRpred: incorporating T-cell receptor repertoire for clinical outcome prediction
    Liu, Meiling
    Liu, Yang
    Hsu, Li
    He, Qianchuan
    FRONTIERS IN GENETICS, 2024, 15
  • [27] Study of the T-cell receptor repertoire by CDR3 spectratyping
    Fozza, Claudio
    Barraqueddu, Francesca
    Corda, Giovanna
    Contini, Salvatore
    Virdis, Patrizia
    Dore, Fausto
    Bonfigli, Silvana
    Longinotti, Maurizio
    JOURNAL OF IMMUNOLOGICAL METHODS, 2017, 440 : 1 - 11
  • [28] Identification of errors introduced during high throughput sequencing of the T cell receptor repertoire
    Nguyen, Phuong
    Ma, Jing
    Pei, Deqing
    Obert, Caroline
    Cheng, Cheng
    Geiger, Terrence L.
    BMC GENOMICS, 2011, 12
  • [29] Characterization of the T-cell receptor repertoire associated with lymph node metastasis in colorectal cancer
    Zhen, Ya'nan
    Wang, Hong
    Jiang, Runze
    Wang, Fang
    Chen, Cunbao
    Xu, Zhongfa
    Xiao, Ruixue
    FRONTIERS IN ONCOLOGY, 2024, 14
  • [30] T-cell receptor repertoires as potential diagnostic markers for patients with COVID-19
    Hou, Xianliang
    Wang, Guangyu
    Fan, Wentao
    Chen, Xiaoyan
    Mo, Chune
    Wang, Yongsi
    Gong, Weiwei
    Wen, Xuyan
    Chen, Hui
    He, Dan
    Mo, Lijun
    Jiang, Shaofeng
    Ou, Minglin
    Guo, Haonan
    Liu, Hongbo
    INTERNATIONAL JOURNAL OF INFECTIOUS DISEASES, 2021, 113 : 308 - 317