A novel statistical method for decontaminating T-cell receptor sequencing data

Cited by: 1
Authors
Li, Ruoxing [1]
Altan, Mehmet [2]
Reuben, Alexandre [2]
Lin, Ruitao [8]
Heymach, John V. [3]
Tran, Hai [2]
Chen, Runzhe [4]
Little, Latasha [5]
Hubert, Shawna [6]
Zhang, Jianjun [7, 9, 10, 11]
Li, Ziyi [8, 12]
Affiliations
[1] Univ Texas Hlth Sci Ctr Houston, Dept Biostat & Data Sci, Houston, TX USA
[2] Univ Texas MD Anderson Canc Ctr, Dept Thorac Head & Neck Med Oncol, Houston, TX USA
[3] Univ Texas MD Anderson Canc Ctr, Chair Thorac Head & Neck Med Oncol, Houston, TX USA
[4] Univ Texas MD Anderson Canc Ctr, Houston, TX USA
[5] Univ Texas MD Anderson Canc Ctr, Dept Thorac Head & Neck Med Oncol, Houston, TX USA
[6] Univ Texas MD Anderson Canc Ctr, Chair Thorac Head & Neck Med Oncol, Houston, TX USA
[7] Univ Texas MD Anderson Canc Ctr, Dept Thorac Head & Neck Med Oncol, Houston, TX USA
[8] Univ Texas MD Anderson Canc Ctr, Dept Biostat, Houston, TX USA
[9] Univ Texas MD Anderson Canc Ctr, Lung Canc Genom Program, Houston, TX 77030 USA
[10] Univ Texas MD Anderson Canc Ctr, Lung Canc Intercept Program, Houston, TX 77030 USA
[11] Univ Texas MD Anderson Canc Ctr, Dept Thorac Head & Neck Med Oncol, Houston, TX 77030 USA
[12] Univ Texas MD Anderson Canc Ctr, Dept Biostat, Houston, TX 77030 USA
Keywords
Bayesian model; Contamination detection; TCR sequencing; TCR repertoire
DOI
10.1093/bib/bbad230
Chinese Library Classification (CLC) number
Q5 [Biochemistry]
Discipline classification codes
071010; 081704
Abstract
The T-cell receptor (TCR) repertoire is highly diverse among the population and plays an essential role in initiating multiple immune processes. TCR sequencing (TCR-seq) has been developed to profile the T cell repertoire. Similar to other high-throughput experiments, contamination can happen during several steps of TCR-seq, including sample collection, preparation and sequencing. Such contamination creates artifacts in the data, leading to inaccurate or even biased results. Most existing methods assume 'clean' TCR-seq data as the starting point with no ability to handle data contamination. Here, we develop a novel statistical model to systematically detect and remove contamination in TCR-seq data. We summarize the observed contamination into two sources, pairwise and cross-cohort. For both sources, we provide visualizations and summary statistics to help users assess the severity of the contamination. Incorporating prior information from 14 existing TCR-seq datasets with minimum contamination, we develop a straightforward Bayesian model to statistically identify contaminated samples. We further provide strategies for removing the impacted sequences to allow for downstream analysis, thus avoiding any need to repeat experiments. Our proposed model shows robustness in contamination detection compared with a few off-the-shelf detection methods in simulation studies. We illustrate the use of our proposed method on two TCR-seq datasets generated locally.
Pages: 8
Related papers
50 in total
  • [31] Characteristics of T-Cell Receptor Repertoire for Differential Response to Methotrexate Treatment for Rheumatoid Arthritis
    Zhao, Taowa
    Zhang, Qian
    Wen, Qinwen
    Liu, Shuyin
    Niu, Zitong
    Qu, Yang
    Wang, Yiting
    Ding, Qiaojiao
    Wei, Pengyao
    Li, Lin
    Kong, Tong
    Fu, Pan
    Qian, Sihua
    Wang, Kaizhe
    Wu, Xiudi
    Zheng, Jianping
    IMMUNOLOGICAL INVESTIGATIONS, 2024, 53 (07) : 1113 - 1124
  • [32] T-cell receptor sequencing reveals hepatocellular carcinoma immune characteristics according to Barcelona Clinic liver cancer stages within liver tissue and peripheral blood
    Li, Rui
    Wang, Junxiao
    Li, Xiubin
    Liang, Yining
    Jiang, Yiyun
    Zhang, Yuwei
    Xu, Pengfei
    Deng, Ling
    Wang, Zhe
    Sun, Tao
    Wu, Jian
    Xie, Hui
    Wang, Yijin
    CANCER SCIENCE, 2024, 115 (01) : 94 - 108
  • [33] Comment on 'rigorous benchmarking of T cell receptor repertoire profiling methods for cancer RNA sequencing'
    Davydov, Alexey N.
    Bolotin, Dmitry A.
    Poslavsky, Stanislav V.
    Chudakov, Dmitry M.
    BRIEFINGS IN BIOINFORMATICS, 2023, 24 (06)
  • [34] Immunosenescence and Autoimmunity: Exploiting the T-Cell Receptor Repertoire to Investigate the Impact of Aging on Multiple Sclerosis
    Amoriello, Roberta
    Mariottini, Alice
    Ballerini, Clara
    FRONTIERS IN IMMUNOLOGY, 2021, 12
  • [35] TCR_Explore: A novel webtool for T cell receptor repertoire analysis
    Mullan, Kerry A.
    Zhang, Justin B.
    Jones, Claerwen M.
    Goh, Shawn J. R.
    Revote, Jerico
    Illing, Patricia T.
    Purcell, Anthony W.
    La Gruta, Nicole L.
    Li, Chen
    Mifsud, Nicole A.
    COMPUTATIONAL AND STRUCTURAL BIOTECHNOLOGY JOURNAL, 2023, 21 : 1272 - 1282
  • [36] Divergent Characteristics of T-Cell Receptor Repertoire Between Essential Hypertension and Aldosterone-Producing Adenoma
    Chang, Che-Mai
    Peng, Kang-Yung
    Chan, Chieh-Kai
    Lin, Yu-Feng
    Liao, Hung-Wei
    Chang, Jan-Gowth
    Wu, Mai-Szu
    Wu, Vin-Cent
    Chang, Wei-Chiao
    FRONTIERS IN IMMUNOLOGY, 2022, 13
  • [37] Characterization of T-cell receptor β chain mRNA expression in IFN-α-responsive chronic myelogenous leukaemia patients
    Shimomura, T
    Fujii, S
    Ezaki, I
    Osato, M
    Fujimoto, K
    Takatsuki, K
    Yamamoto, K
    Kawakita, M
    BRITISH JOURNAL OF HAEMATOLOGY, 1999, 105 (01) : 173 - 180
  • [38] TCR-L: an analysis tool for evaluating the association between the T-cell receptor repertoire and clinical phenotypes
    Meiling Liu
    Juna Goo
    Yang Liu
    Wei Sun
    Michael C. Wu
    Li Hsu
    Qianchuan He
    BMC Bioinformatics, 23
  • [39] Quantification of Inter-Sample Differences in T-Cell Receptor Repertoires Using Sequence-Based Information
    Yokota, Ryo
    Kaminaga, Yuki
    Kobayashi, Tetsuya J.
    FRONTIERS IN IMMUNOLOGY, 2017, 8
  • [40] TCR-L: an analysis tool for evaluating the association between the T-cell receptor repertoire and clinical phenotypes
    Liu, Meiling
    Goo, Juna
    Liu, Yang
    Sun, Wei
    Wu, Michael C.
    Hsu, Li
    He, Qianchuan
    BMC BIOINFORMATICS, 2022, 23 (01)