netNMF-sc: leveraging gene-gene interactions for imputation and dimensionality reduction in single-cell expression analysis

Cited by: 63
Authors
Elyanow, Rebecca [1, 2]
Dumitrascu, Bianca [3, 5, 6]
Engelhardt, Barbara E. [2, 4, 7]
Raphael, Benjamin J. [2]
Affiliations
[1] Brown Univ, Ctr Computat Mol Biol, Providence, RI 02912 USA
[2] Princeton Univ, Dept Comp Sci, Princeton, NJ 08540 USA
[3] Princeton Univ, Lewis Sigler Inst Integrat Genom, Princeton, NJ 08540 USA
[4] Princeton Univ, Ctr Stat & Machine Learning, Princeton, NJ 08540 USA
[5] Duke Univ, SAMSI, Durham, NC 27706 USA
[6] Duke Univ, Dept Stat Sci, Durham, NC 27706 USA
[7] Genomics Plc, Oxford, England
Funding
National Institutes of Health (USA); National Science Foundation (USA)
Keywords
RNA-SEQ; REVEALS; TRANSCRIPTOMICS; MICROARRAY; DATABASE;
DOI
10.1101/gr.251603.119
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Single-cell RNA-sequencing (scRNA-seq) enables high-throughput measurement of RNA expression in single cells. However, because of technical limitations, scRNA-seq data often contain zero counts for many transcripts in individual cells. These zero counts, or dropout events, complicate the analysis of scRNA-seq data using standard methods developed for bulk RNA-seq data. Current scRNA-seq analysis methods typically overcome dropout by combining information across cells in a lower-dimensional space, leveraging the observation that cells generally occupy a small number of RNA expression states. We introduce netNMF-sc, an algorithm for scRNA-seq analysis that leverages information across both cells and genes. netNMF-sc learns a low-dimensional representation of scRNA-seq transcript counts using network-regularized non-negative matrix factorization. The network regularization takes advantage of prior knowledge of gene-gene interactions, encouraging pairs of genes with known interactions to be nearby each other in the low-dimensional representation. The resulting matrix factorization imputes gene abundance for both zero and nonzero counts and can be used to cluster cells into meaningful subpopulations. We show that netNMF-sc outperforms existing methods at clustering cells and estimating gene-gene covariance using both simulated and real scRNA-seq data, with increasing advantages at higher dropout rates (e.g., >60%). We also show that the results from netNMF-sc are robust to variation in the input network, with more representative networks leading to greater performance gains.
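The network-regularized factorization described in the abstract can be illustrated with a minimal sketch. It assumes a squared-error loss and multiplicative updates in the style of graph-regularized NMF; the netNMF-sc software may use a different objective and optimizer. The function names, the parameters `d` and `lam`, and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np

def objective(X, A, W, H, lam):
    """Squared reconstruction error plus the graph-Laplacian penalty on W."""
    L = np.diag(A.sum(axis=1)) - A          # graph Laplacian of the gene network
    return np.linalg.norm(X - W @ H) ** 2 + lam * np.trace(W.T @ L @ W)

def network_regularized_nmf(X, A, d=2, lam=1.0, n_iter=200, seed=0, eps=1e-10):
    """Factor X (genes x cells) as W @ H with W, H >= 0.

    A is a symmetric, non-negative gene-gene adjacency matrix. The penalty
    lam * tr(W^T L W) pulls the rows of W for connected genes together.
    Illustrative only: not the netNMF-sc implementation.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_cells = X.shape
    W = rng.random((n_genes, d)) + eps
    H = rng.random((d, n_cells)) + eps
    D = np.diag(A.sum(axis=1))              # degree matrix
    for _ in range(n_iter):
        # Multiplicative updates keep both factors non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T + lam * (A @ W)) / (W @ (H @ H.T) + lam * (D @ W) + eps)
    return W, H

# Toy example: 6 genes x 8 cells with two known gene-gene interactions.
rng = np.random.default_rng(1)
X = rng.poisson(5.0, size=(6, 8)).astype(float)
X[rng.random(X.shape) < 0.3] = 0.0          # simulate dropout as zero counts
A = np.zeros((6, 6))
A[0, 1] = A[1, 0] = 1.0
A[2, 3] = A[3, 2] = 1.0

W, H = network_regularized_nmf(X, A, d=2, lam=0.5)
X_imputed = W @ H                           # dense estimate for zero and nonzero entries
```

In this sketch the columns of `H` give a low-dimensional representation of the cells that could be passed to a clustering method, and `X_imputed` plays the role of the imputed expression matrix.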
Pages: 195-204
Page count: 10
Related papers
50 in total
  • [41] SCDC: bulk gene expression deconvolution by multiple single-cell RNA sequencing references
    Dong, Meichen
    Thennavan, Aatish
    Urrutia, Eugene
    Li, Yun
    Perou, Charles M.
    Zou, Fei
    Jiang, Yuchao
    BRIEFINGS IN BIOINFORMATICS, 2021, 22 (01) : 416 - 427
  • [42] Quantitative single-cell gene expression measurements of multiple genes in response to hypoxia treatment
    Zeng, Jia
    Wang, Jiangxin
    Gao, Weimin
    Mohammadreza, Aida
    Kelbauskas, Laimonas
    Zhang, Weiwen
    Johnson, Roger H.
    Meldrum, Deirdre R.
    ANALYTICAL AND BIOANALYTICAL CHEMISTRY, 2011, 401 (01) : 3 - 13
  • [43] Pan-Cancer and Single-Cell Modeling of Genomic Alterations Through Gene Expression
    Mercatelli, Daniele
    Ray, Forest
    Giorgi, Federico M.
    FRONTIERS IN GENETICS, 2019, 10
  • [44] Dynamic single-cell measurements of gene expression in primary lymphocytes: challenges, tools and prospects
    Polonsky, Michal
    Zaretsky, Irina
    Friedman, Nir
    BRIEFINGS IN FUNCTIONAL GENOMICS, 2013, 12 (02) : 99 - 108
  • [45] Discovering a Four-Gene Prognostic Model Based on Single-Cell Data and Gene Expression Data of Pancreatic Adenocarcinoma
    Huang, Weizhen
    Li, Jun
    Zhou, Siwei
    Li, Yi
    Yuan, Xia
    FRONTIERS IN ENDOCRINOLOGY, 2022, 13
  • [46] GiniClust: detecting rare cell types from single-cell gene expression data with Gini index
    Lan Jiang
    Huidong Chen
    Luca Pinello
    Guo-Cheng Yuan
    Genome Biology, 17
  • [47] GiniClust: detecting rare cell types from single-cell gene expression data with Gini index
    Jiang, Lan
    Chen, Huidong
    Pinello, Luca
    Yuan, Guo-Cheng
    GENOME BIOLOGY, 2016, 17
  • [48] Gut mucosa dissociation protocols influence cell type proportions and single-cell gene expression levels
    Venema, Werna T. C. Uniken
    Ramirez-Sanchez, Aaron D.
    Bigaeva, Emilia
    Withoff, Sebo
    Jonkers, Iris
    McIntyre, Rebecca E.
    Ghouraba, Mennatallah
    Raine, Tim
    Weersma, Rinse K.
    Franke, Lude
    Festen, Eleonora A. M.
    van der Wijst, Monique G. P.
    SCIENTIFIC REPORTS, 2022, 12 (01)
  • [49] G2S3: A gene graph-based imputation method for single-cell RNA sequencing data
    Wu, Weimiao
    Liu, Yunqing
    Dai, Qile
    Yan, Xiting
    Wang, Zuoheng
    PLOS COMPUTATIONAL BIOLOGY, 2021, 17 (05)
  • [50] Dirichlet Process Mixture Model for Correcting Technical Variation in Single-Cell Gene Expression Data
    Prabhakaran, Sandhya
    Azizi, Elham
    Carr, Ambrose
    Pe'er, Dana
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48