RIA: a novel Regression-based Imputation Approach for single-cell RNA sequencing

Cited by: 6
Authors
Bang Tran [1]
Duc Tran [1]
Hung Nguyen [1]
Nam Sy Vo [2]
Tin Nguyen [1]
Affiliations
[1] Univ Nevada, Comp Sci & Engn, Reno, NV 89557 USA
[2] Vingrp Big Data Inst, Computat Biomed, Hanoi, Vietnam
Source
PROCEEDINGS OF 2019 11TH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SYSTEMS ENGINEERING (KSE 2019) | 2019
Funding
National Aeronautics and Space Administration (NASA);
Keywords
single cell; scRNA-seq; imputation; sequencing; gene expression; heterogeneity; embryos; fate
DOI
10.1109/kse.2019.8919334
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Advances in single-cell technologies have shifted genomics research from the analysis of bulk tissues toward a comprehensive characterization of individual cells. This shift opens enormous opportunities for both basic biology and clinical research: identifying and characterizing short-lived progenitors, stem cells, cancer stem cells, or circulating tumor cells is essential to better understand both normal and diseased tissue biology. However, quantifying gene expression in each cell remains a significant challenge because of the low amount of mRNA available within individual cells, which leads to an excess of zero counts caused by dropout events. Here we introduce RIA, a regression-based approach that reliably recovers the missing values in single-cell data and thus improves the performance of downstream analyses. We compare RIA with state-of-the-art methods using five scRNA-seq datasets with a total of 3,535 cells. In each dataset analyzed, RIA outperforms existing approaches in improving the identification of cell populations while preserving the biological landscape. We also demonstrate that RIA is able to infer temporal trajectories of embryonic development stages.
Pages: 229-237
Page count: 9