A Comparison for Dimensionality Reduction Methods of Single-Cell RNA-seq Data

Cited by: 67
Authors
Xiang, Ruizhi [1]
Wang, Wencan [2, 3]
Yang, Lei [1]
Wang, Shiyuan [1]
Xu, Chaohan [1]
Chen, Xiaowen [1]
Affiliations
[1] Harbin Med Univ, Coll Bioinformat Sci & Technol, Harbin, Peoples R China
[2] Wenzhou Med Univ, Sch Optometry & Ophthalmol, Wenzhou, Peoples R China
[3] Wenzhou Med Univ, Eye Hosp, Wenzhou, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
single-cell RNA-seq; dimension reduction; benchmark; sequences analysis; deep learning; GENE-EXPRESSION;
DOI
10.3389/fgene.2021.646936
Chinese Library Classification
Q3 [Genetics];
Discipline classification code
071007 ; 090102 ;
Abstract
Single-cell RNA sequencing (scRNA-seq) is a high-throughput sequencing technology performed at the level of individual cells, with the potential to reveal cellular heterogeneity. However, scRNA-seq data are high-dimensional, noisy, and sparse. Dimension reduction is therefore an important step in the downstream analysis of scRNA-seq data, and several dimension reduction methods have been developed. We developed a strategy to evaluate the stability, accuracy, and computing cost of 10 dimensionality reduction methods using 30 simulated datasets and five real datasets. Additionally, we investigated the sensitivity of all the methods to hyperparameter tuning and offer users suggestions accordingly. We found that t-distributed stochastic neighbor embedding (t-SNE) yielded the best overall performance, with the highest accuracy but also the highest computing cost. Meanwhile, uniform manifold approximation and projection (UMAP) exhibited the highest stability, as well as moderate accuracy and the second highest computing cost. UMAP preserves the original cohesion and separation of cell populations well. In addition, users need to set hyperparameters according to the specific situation before using dimensionality reduction methods based on non-linear models or neural networks.
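One common way to quantify the "cohesion and separation" of cell populations in a low-dimensional embedding is the silhouette coefficient. The abstract does not say which metric the authors used, so the sketch below is only an illustration under that assumption. The function name `silhouette_score`, the toy 2-D coordinates, and the label vectors are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def silhouette_score(points, labels):
    """Mean silhouette coefficient of labelled points (Euclidean distance).

    Illustrative helper, not the paper's implementation.
    """
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    def mean_dist(i, members):
        # Average distance from point i to every other point in `members`.
        others = [j for j in members if j != i]
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    total = 0.0
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        a = mean_dist(i, own)  # cohesion: mean distance within own cluster
        b = min(  # separation: mean distance to the nearest other cluster
            mean_dist(i, members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        if denom > 0:
            total += (b - a) / denom
    return total / len(points)


# Toy 2-D "embedding" of four cells (made-up coordinates).
embedding = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
matching_labels = [0, 0, 1, 1]    # labels agree with the spatial grouping
scrambled_labels = [0, 1, 0, 1]   # labels cut across the spatial grouping

print(silhouette_score(embedding, matching_labels))
print(silhouette_score(embedding, scrambled_labels))
```

Scores close to 1 indicate tight, well-separated populations, while scores at or below 0 indicate overlapping ones. Comparing the score computed on an embedding with the score computed on the original expression matrix is one possible way to check how well a method preserves population structure.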
Pages: 12
Related papers
50 in total
  • [31] Identifying Lung Cancer Cell Markers with Machine Learning Methods and Single-Cell RNA-Seq Data
    Huang, Guo-Hua
    Zhang, Yu-Hang
    Chen, Lei
    Li, You
    Huang, Tao
    Cai, Yu-Dong
    LIFE-BASEL, 2021, 11 (09):
  • [32] Deep Batch Integration and Denoise of Single-Cell RNA-Seq Data
    Qin, Lu
    Zhang, Guangya
    Zhang, Shaoqiang
    Chen, Yong
    ADVANCED SCIENCE, 2024, 11 (29)
  • [33] Identification of innate lymphoid cells in single-cell RNA-Seq data
    Madeleine Suffiotti
    Santiago J. Carmona
    Camilla Jandus
    David Gfeller
    Immunogenetics, 2017, 69 : 439 - 450
  • [34] VASC: Dimension Reduction and Visualization of Single-cell RNA-seq Data by Deep Variational Autoencoder
    Dongfang Wang
    Jin Gu
    Genomics, Proteomics & Bioinformatics, 2018, (05) : 320 - 331
  • [35] Zero-preserving imputation of single-cell RNA-seq data
    Linderman, George C.
    Zhao, Jun
    Roulis, Manolis
    Bielecki, Piotr
    Flavell, Richard A.
    Nadler, Boaz
    Kluger, Yuval
    NATURE COMMUNICATIONS, 2022, 13 (01)
  • [36] A Survey on Methods for Predicting Polyadenylation Sites from DNA Sequences, Bulk RNA-seq, and Single-cell RNA-seq
    Ye, Wenbin
    Lian, Qiwei
    Ye, Congting
    Wu, Xiaohui
    GENOMICS PROTEOMICS & BIOINFORMATICS, 2023, 21 (01) : 67 - 83
  • [37] A Hybrid Clustering Algorithm for Identifying Cell Types from Single-Cell RNA-Seq Data
    Zhu, Xiaoshu
    Li, Hong-Dong
    Xu, Yunpei
    Guo, Lilu
    Wu, Fang-Xiang
    Duan, Guihua
    Wang, Jianxin
    GENES, 2019, 10 (02)
  • [38] A deep matrix factorization based approach for single-cell RNA-seq data clustering
    Liang, Zhenlan
    Zheng, Ruiqing
    Chen, Siqi
    Yan, Xuhua
    Li, Min
    METHODS, 2022, 205 : 114 - 122
  • [39] A Global Similarity Learning for Clustering of Single-Cell RNA-Seq Data
    Zhu, Xiaoshu
    Guo, Lilu
    Xu, Yunpei
    Li, Hong-Dong
    Liao, Xingyu
    Wu, Fang-Xiang
    Peng, Xiaoqing
    2019 IEEE INTERNATIONAL CONFERENCE ON BIOINFORMATICS AND BIOMEDICINE (BIBM), 2019, : 261 - 266
  • [40] An interpretable framework for clustering single-cell RNA-Seq datasets
    Zhang, Jesse M.
    Fan, Jue
    Fan, Christina
    Rosenfeld, David
    Tse, David N.
    BMC BIOINFORMATICS, 2018, 19