Screen technical noise in single cell RNA sequencing data

Cited by: 7
Authors
Bai, Yu-Long [1]
Baddoo, Melody [2]
Flemington, Erik K. [2]
Nakhoul, Hani N. [2]
Liu, Yao-Zhong [1]
Affiliations
[1] Tulane Univ, Sch Publ Hlth & Trop Med, Dept Global Biostat & Data Sci, 1440 Canal St,Suite 1610, New Orleans, LA 70112 USA
[2] Tulane Univ, Hlth Sci Ctr, Tulane Canc Ctr, Dept Pathol, New Orleans, LA 70118 USA
Funding
National Science Foundation (USA)
Keywords
Single cell RNA-seq; Next generation sequencing; QC; Housekeeping genes; SCQC; ESTABLISHMENT;
DOI
10.1016/j.ygeno.2019.02.014
Chinese Library Classification (CLC) number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology]
Discipline classification codes
071005; 0836; 090102; 100705
Abstract
We proposed a data-cleaning pipeline for single-cell (SC) RNA-seq data that first screens genes (gene-wise screening) and then screens cell libraries (library-wise screening). Gene-wise screening rests on the expectation that, for a gene with low technical noise, the gene's count in a library tends to increase with library size; this was tested using negative binomial regression of gene count (dependent variable) on library size (independent variable). Library-wise screening rests on the expectation that, in libraries with low technical noise, across-library correlations for housekeeping (HK) genes are higher than those for non-housekeeping (NHK) genes. We removed libraries whose mean pairwise correlation for HK genes was not significantly higher than that for NHK genes. We successfully applied the pipeline to two large SC RNA-seq datasets. The pipeline was also developed into an R package.
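The library-wise screening idea in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' R package or their exact test. The function names `pearson` and `screen_libraries`, the paired t-like statistic, and the `t_cutoff` value are all assumptions made for illustration. For each library, the sketch computes its correlation with every other library over HK genes and over NHK genes, and keeps the library only if the HK correlations exceed the NHK correlations by the chosen margin.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def screen_libraries(hk, nhk, t_cutoff=1.645):
    """Return indices of libraries that pass a hypothetical HK-vs-NHK check.

    hk, nhk: expression matrices as lists of gene rows; each row holds one
    value per library (at least three libraries). Both matrices must list
    the libraries in the same column order.
    t_cutoff: assumed one-sided threshold on a paired t-like statistic;
    the original paper's significance test may differ.
    """
    n_lib = len(hk[0])

    def column(matrix, j):
        return [row[j] for row in matrix]

    keep = []
    for j in range(n_lib):
        # Difference between HK and NHK correlation for each partner library.
        diffs = [
            pearson(column(hk, j), column(hk, k))
            - pearson(column(nhk, j), column(nhk, k))
            for k in range(n_lib)
            if k != j
        ]
        m = len(diffs)
        mean_d = sum(diffs) / m
        sd = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (m - 1))
        if sd == 0:
            passed = mean_d > 0
        else:
            passed = mean_d / (sd / math.sqrt(m)) > t_cutoff
        if passed:
            keep.append(j)
    return keep
```

The differences for one library are not independent across partner libraries, so the statistic is a rough screening heuristic and not a calibrated test. A real analysis would also normalize counts before computing correlations.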
Pages: 346-355
Number of pages: 10