Prior robust empirical Bayes inference for large-scale data by conditioning on rank with application to microarray data

Cited by: 3
Authors
Liao, J. G. [1]
McMurry, Timothy [2]
Berg, Arthur [1]
Affiliations
[1] Penn State Univ, Div Biostat & Bioinformat, Hershey, PA 17033 USA
[2] Univ Virginia, Div Biostat, Charlottesville, VA 22908 USA
Keywords
Bayesian shrinkage; Confidence intervals; Ranking bias; Robust multiple estimation; MULTIPLE CONFIDENCE-INTERVALS; GENE-EXPRESSION; SELECTION; MODEL
DOI
10.1093/biostatistics/kxt026
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
Empirical Bayes methods have been extensively used for microarray data analysis by modeling the large number of unknown parameters as random effects. Empirical Bayes allows borrowing information across genes and can automatically adjust for multiple testing and selection bias. However, the standard empirical Bayes model can perform poorly if the assumed working prior deviates from the true prior. This paper proposes a new rank-conditioned inference in which the shrinkage and confidence intervals are based on the distribution of the error conditioned on the rank of the data. Our approach is in contrast to a Bayesian posterior, which conditions on the data themselves. The new method is almost as efficient as standard Bayesian methods when the working prior is close to the true prior, and it is much more robust when the working prior is not close. In addition, it allows a more accurate (but also more complex) non-parametric estimate of the prior to be easily incorporated, resulting in improved inference. The new method's prior robustness is demonstrated via simulation experiments. Application to a breast cancer gene expression microarray dataset is presented. Our R package rank.Shrinkage provides a ready-to-use implementation of the proposed methodology.
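The idea of conditioning on rank rather than on the data can be illustrated with a small simulation. The sketch below is not the authors' algorithm and does not use the rank.Shrinkage API. It assumes a normal working prior N(0, tau^2) and normal errors N(0, sigma^2), and the function names `rank_conditioned_bias` and `rank_shrink` are hypothetical. It estimates by Monte Carlo the mean error of the observation at each rank and subtracts that from the observed value.

```python
import numpy as np

def rank_conditioned_bias(n, tau, sigma, n_sim=2000, seed=0):
    """Monte Carlo estimate of E[x - theta | rank of x] for each rank 0..n-1.

    Assumption (illustrative only): theta_i ~ N(0, tau^2), x_i = theta_i + e_i,
    e_i ~ N(0, sigma^2), all independent.
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(0.0, tau, size=(n_sim, n))
    x = theta + rng.normal(0.0, sigma, size=(n_sim, n))
    # Sort each simulated data set by x and carry the errors along,
    # so column r holds the error of the rank-r observation.
    order = np.argsort(x, axis=1)
    err = np.take_along_axis(x - theta, order, axis=1)
    return err.mean(axis=0)

def rank_shrink(x, tau, sigma, n_sim=2000, seed=0):
    """Subtract the simulated rank-conditioned mean error from each observation."""
    x = np.asarray(x, dtype=float)
    bias = rank_conditioned_bias(len(x), tau, sigma, n_sim=n_sim, seed=seed)
    ranks = np.argsort(np.argsort(x))  # rank 0 = smallest observation
    return x - bias[ranks]
```

Under these assumptions the largest observations have positive mean error and the smallest have negative mean error, so the correction pulls the extremes toward the center. Interval estimates could be sketched in the same way by taking quantiles of the simulated errors at each rank instead of their mean.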
Pages: 60-73
Page count: 14