RUV-III-NB: normalization of single cell RNA-seq data

Cited by: 5
Authors
Salim, Agus [1, 2, 3, 4, 5]
Molania, Ramyar [2]
Wang, Jianan [2, 6]
De Livera, Alysha [1, 2, 4, 5, 7]
Thijssen, Rachel [8]
Speed, Terence P. [2, 3]
Affiliations
[1] Univ Melbourne, Melbourne Sch Populat & Global Hlth, Melbourne, Vic 3053, Australia
[2] Walter & Eliza Hall Inst Med Res, Bioinformat Div, Parkville, Vic 3052, Australia
[3] Univ Melbourne, Sch Math & Stat, Melbourne, Vic 3010, Australia
[4] Baker Heart & Diabet Inst, Melbourne, Vic 3004, Australia
[5] La Trobe Univ, Dept Math & Stat, Bundoora, Vic 3086, Australia
[6] Univ Melbourne, Dept Med Biol, Melbourne, Vic 3010, Australia
[7] RMIT Univ, Sch Sci, Melbourne, Vic 3000, Australia
[8] Walter & Eliza Hall Inst Med Res, Blood Cells & Blood Canc Div, Parkville, Vic 3052, Australia
Funding
UK Medical Research Council;
Keywords
UNWANTED VARIATION; SEQUENCING DATA; EXPRESSION;
DOI
10.1093/nar/gkac486
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Discipline classification code
071010; 081704;
Abstract
Normalization of single cell RNA-seq data remains a challenging task. The performance of different methods can vary greatly between datasets when unwanted factors and biology are associated. Most normalization methods also remove the effects of unwanted variation only from the cell embedding, not from the gene-level data typically used for differential expression (DE) analysis to identify marker genes. We propose RUV-III-NB, a method that can be used to remove unwanted variation from both the cell embedding and gene-level counts. Using pseudo-replicates, RUV-III-NB explicitly takes into account potential association with biology when removing unwanted variation. The method can be used for both UMI and read counts and returns adjusted counts that can be used for downstream analyses such as clustering, DE and pseudotime analyses. Using published datasets with different technological platforms, kinds of biology and levels of association between biology and unwanted variation, we show that RUV-III-NB manages to remove library size and batch effects, strengthen biological signals, improve DE analyses, and lead to results exhibiting greater concordance with independent datasets of the same kind. The performance of RUV-III-NB is consistent and is not sensitive to the number of factors assumed to contribute to the unwanted variation.
Pages: 15