Screen technical noise in single cell RNA sequencing data

Cited by: 6
Authors
Bai, Yu-Long [1]
Baddoo, Melody [2]
Flemington, Erik K. [2]
Nakhoul, Hani N. [2]
Liu, Yao-Zhong [1]
Affiliations
[1] Tulane Univ, Sch Publ Hlth & Trop Med, Dept Global Biostat & Data Sci, 1440 Canal St,Suite 1610, New Orleans, LA 70112 USA
[2] Tulane Univ, Hlth Sci Ctr, Tulane Canc Ctr, Dept Pathol, New Orleans, LA 70118 USA
Funding
National Science Foundation (United States);
Keywords
Single cell RNA-seq; Next generation sequencing; QC; Housekeeping genes; SCQC; ESTABLISHMENT;
DOI
10.1016/j.ygeno.2019.02.014
Chinese Library Classification number
Q81 [Bioengineering (Biotechnology)]; Q93 [Microbiology];
Subject classification codes
071005 ; 0836 ; 090102 ; 100705 ;
Abstract
We propose a data cleaning pipeline for single-cell (SC) RNA-seq data that first screens genes (gene-wise screening) and then screens cell libraries (library-wise screening). Gene-wise screening rests on the expectation that, for a gene with low technical noise, the gene's count in a library tends to increase with library size; this is tested by negative binomial regression of gene count (dependent variable) on library size (independent variable). Library-wise screening rests on the expectation that, in libraries with low technical noise, across-library correlations for housekeeping (HK) genes are higher than those for non-housekeeping (NHK) genes. We remove libraries whose mean pairwise correlation for HK genes is not significantly higher than that for NHK genes. We successfully applied the pipeline to two large SC RNA-seq datasets. The pipeline has also been developed into an R package.
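The library-wise step described in the abstract can be illustrated with a small sketch. It is not the authors' R package and does not reproduce their code. The abstract does not say which significance test is used, so this sketch assumes a paired one-sided t-statistic with an illustrative cutoff of 1.645; it also assumes a log1p transform, Pearson correlation, and reads "pairwise correlation" as the correlation between one library's expression profile and each other library's profile. The names `pearson`, `profiles`, `screen_libraries`, and `toy_counts` are hypothetical, and the toy data are made up for illustration only.

```python
import math
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; 0.0 if either is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def profiles(counts, genes):
    """One log1p-transformed expression vector per library, restricted to `genes`."""
    n_libs = len(counts[genes[0]])
    return [[math.log1p(counts[g][j]) for g in genes] for j in range(n_libs)]


def screen_libraries(counts, hk_genes, t_cutoff=1.645):
    """Return one bool per library (True = keep). Needs at least 3 libraries.

    Assumption: a library is kept when the paired t-statistic of
    r_HK(j, k) - r_NHK(j, k) over all partner libraries k exceeds t_cutoff.
    """
    hk = [g for g in counts if g in hk_genes]
    nhk = [g for g in counts if g not in hk_genes]
    hk_prof, nhk_prof = profiles(counts, hk), profiles(counts, nhk)
    n_libs = len(hk_prof)
    keep = []
    for j in range(n_libs):
        diffs = [
            pearson(hk_prof[j], hk_prof[k]) - pearson(nhk_prof[j], nhk_prof[k])
            for k in range(n_libs)
            if k != j
        ]
        m, sd = mean(diffs), stdev(diffs)
        if sd == 0:
            t = math.inf if m > 0 else -math.inf
        else:
            t = m / (sd / math.sqrt(len(diffs)))
        keep.append(t > t_cutoff)
    return keep


def toy_counts(n_good=11):
    """Made-up data: n_good clean libraries plus one noisy library (the last)."""
    n_libs = n_good + 1
    depth = [j + 1 for j in range(n_libs)]  # hypothetical sequencing-depth factors
    counts = {}
    for g in range(10):
        # HK genes share one profile across clean libraries, scaled by depth;
        # the noisy library gets the reversed profile.
        row = [10 * (g + 1) * depth[j] for j in range(n_good)]
        row.append(10 * (10 - g))
        counts[f"HK{g}"] = row
    for g in range(2 * n_libs):
        # NHK genes: each library expresses only its own pair of genes.
        counts[f"NHK{g}"] = [200 * depth[j] if g // 2 == j else 0 for j in range(n_libs)]
    return counts


if __name__ == "__main__":
    hk_names = {f"HK{g}" for g in range(10)}
    print(screen_libraries(toy_counts(), hk_names))
```

In the toy data the last library has a scrambled HK profile, so its HK correlations with the other libraries fall below its NHK correlations and it is flagged for removal. With real data, the HK gene list, the correlation measure, and the test would need to follow the published method rather than these placeholders.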
Pages: 346 - 355
Page count: 10
Related papers
18 in total
  • [1] Modeling Enzyme Processivity Reveals that RNA-Seq Libraries Are Biased in Characteristic and Correctable Ways
    Archer, Nathan
    Walsh, Mark D.
    Shahrezaei, Vahid
    Hebenstreit, Daniel
    [J]. CELL SYSTEMS, 2016, 3 (05) : 467 - +
  • [2] Brennecke P, 2013, NAT METHODS, V10, P1093, DOI 10.1038/nmeth.2645
  • [3] STAR: ultrafast universal RNA-seq aligner
    Dobin, Alexander
    Davis, Carrie A.
    Schlesinger, Felix
    Drenkow, Jorg
    Zaleski, Chris
    Jha, Sonali
    Batut, Philippe
    Chaisson, Mark
    Gingeras, Thomas R.
    [J]. BIOINFORMATICS, 2013, 29 (01) : 15 - 21
  • [4] Human housekeeping genes, revisited
    Eisenberg, Eli
    Levanon, Erez Y.
    [J]. TRENDS IN GENETICS, 2013, 29 (10) : 569 - 574
  • [5] Kharchenko PV, 2014, NAT METHODS, V11, P740, DOI 10.1038/nmeth.2967
  • [6] Characterizing noise structure in single-cell RNA-seq distinguishes genuine from technical stochastic allelic expression
    Kim, Jong Kyoung
    Kolodziejczyk, Aleksandra A.
    Illicic, Tomislav
    Teichmann, Sarah A.
    Marioni, John C.
    [J]. NATURE COMMUNICATIONS, 2015, 6
  • [7] Establishment and characterization of six human gastric carcinoma cell lines, including one naturally infected with Epstein-Barr virus
    Ku, Ja-Lok
    Kim, Kyung-Hee
    Choi, Jin-Sung
    Kim, Sung-Hee
    Shin, Young-Kyoung
    Chang, Hee Jin
    Bae, Jae-Moon
    Kim, Young-Woo
    Lee, Jun Ho
    Yang, Han-Kwang
    Kim, Woo-Ho
    Jeong, Seung-Yong
    Park, Jae-Gahb
    [J]. CELLULAR ONCOLOGY, 2012, 35 (02) : 127 - 136
  • [8] Advances in understanding tumour evolution through single-cell sequencing
    Kuipers, Jack
    Jahn, Katharina
    Beerenwinkel, Niko
    [J]. BIOCHIMICA ET BIOPHYSICA ACTA-REVIEWS ON CANCER, 2017, 1867 (02) : 127 - 138
  • [9] Single-cell sequencing
    Nawy, Tal
    [J]. NATURE METHODS, 2014, 11 (01) : 18 - 18
  • [10] Park JG, 1997, INT J CANCER, V70, P443, DOI 10.1002/(SICI)1097-0215(19970207)70:4<443::AID-IJC12>3.0.CO