SNV identification from single-cell RNA sequencing data

Cited by: 19
Authors
Schnepp, Patricia M.
Chen, Mengjie
Keller, Evan T.
Zhou, Xiang
Affiliations
[1] Department of Urology, University of Michigan, Medical School, Ann Arbor, MI 48109
[2] Department of Medicine, University of Chicago, Chicago, IL
[3] Biointerfaces Institute, University of Michigan, Medical School, Ann Arbor, MI
[4] Department of Biostatistics, University of Michigan, Medical School, Ann Arbor, MI
[5] Center for Statistical Genetics, University of Michigan, Medical School, Ann Arbor, MI
Funding
National Institutes of Health (NIH), USA
Keywords
GENE-EXPRESSION; UNDERSTANDING MECHANISMS; CANCER; PLURIPOTENT; HETEROGENEITY; NUCLEOTIDE; STATE;
DOI
10.1093/hmg/ddz207
Chinese Library Classification (CLC) number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
Integrating single-cell RNA sequencing (scRNA-seq) data with genotypes obtained from DNA sequencing studies facilitates the detection of functional genetic variants underlying cell type-specific gene expression variation. Unfortunately, most existing scRNA-seq studies do not come with DNA sequencing data; thus, being able to call single nucleotide variants (SNVs) from scRNA-seq data alone can provide crucial, complementary information and enable the detection of functional SNVs, maximizing the potential of existing scRNA-seq studies. Here, we perform extensive analyses to evaluate the utility of two SNV calling pipelines (GATK and Monovar), originally designed for SNV calling in bulk and single-cell DNA sequencing data, respectively. In both pipelines, we examined various parameter settings to determine the accuracy of the final SNV call set and provide practical recommendations for applied analysts. We found that combining all reads from the single cells and following GATK Best Practices yielded the highest number of SNVs identified with high concordance. In individual single cells, Monovar produced better-quality SNVs, although none of the pipelines analyzed is capable of calling a reasonable number of SNVs with high accuracy. In addition, we found that SNV calling quality varies across different functional genomic regions. Our results open doors for novel ways to leverage scRNA-seq for the future investigation of SNV function.
Pages: 3569-3583
Page count: 21