Prior robust empirical Bayes inference for large-scale data by conditioning on rank with application to microarray data

Cited by: 3
Authors
Liao, J. G. [1 ]
McMurry, Timothy [2 ]
Berg, Arthur [1 ]
Affiliations
[1] Penn State Univ, Div Biostat & Bioinformat, Hershey, PA 17033 USA
[2] Univ Virginia, Div Biostat, Charlottesville, VA 22908 USA
Keywords
Bayesian shrinkage; Confidence intervals; Ranking bias; Robust multiple estimation; MULTIPLE CONFIDENCE-INTERVALS; GENE-EXPRESSION; SELECTION; MODEL;
DOI
10.1093/biostatistics/kxt026
Chinese Library Classification (CLC) number
Q [Biological sciences]
Discipline classification codes
07; 0710; 09
Abstract
Empirical Bayes methods have been extensively used for microarray data analysis by modeling the large number of unknown parameters as random effects. Empirical Bayes allows borrowing information across genes and can automatically adjust for multiple testing and selection bias. However, the standard empirical Bayes model can perform poorly if the assumed working prior deviates from the true prior. This paper proposes a new rank-conditioned inference in which the shrinkage and confidence intervals are based on the distribution of the error conditioned on rank of the data. Our approach is in contrast to a Bayesian posterior, which conditions on the data themselves. The new method is almost as efficient as standard Bayesian methods when the working prior is close to the true prior, and it is much more robust when the working prior is not close. In addition, it allows a more accurate (but also more complex) non-parametric estimate of the prior to be easily incorporated, resulting in improved inference. The new method's prior robustness is demonstrated via simulation experiments. Application to a breast cancer gene expression microarray dataset is presented. Our R package rank.Shrinkage provides a ready-to-use implementation of the proposed methodology.
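To make the idea of conditioning on rank more concrete, the following is a minimal Monte Carlo sketch, not the authors' implementation and not the interface of the rank.Shrinkage package. It assumes a normal working prior and normal noise with known standard deviations; the function names `rank_conditioned_errors`, `summarize`, and `rank_conditioned_estimate` and all parameters are hypothetical and chosen only for illustration. The sketch simulates the error (observation minus true effect) for each rank position, then uses the mean error at a rank as a shrinkage correction and its empirical quantiles as an interval.

```python
import random
import statistics


def rank_conditioned_errors(m, n_sim, prior_sd=1.0, noise_sd=1.0, seed=0):
    """Simulate errors x - theta, grouped by the rank of x among m observations.

    Assumption (illustrative only): theta ~ N(0, prior_sd^2) is the working
    prior and x = theta + N(0, noise_sd^2) is the observation model.
    """
    rng = random.Random(seed)
    errors_by_rank = [[] for _ in range(m)]
    for _ in range(n_sim):
        theta = [rng.gauss(0.0, prior_sd) for _ in range(m)]
        x = [t + rng.gauss(0.0, noise_sd) for t in theta]
        order = sorted(range(m), key=lambda i: x[i])  # rank 0 = smallest x
        for rank, i in enumerate(order):
            errors_by_rank[rank].append(x[i] - theta[i])
    return errors_by_rank


def summarize(errors, level=0.90):
    """Return the mean error and empirical lower/upper quantiles for one rank."""
    s = sorted(errors)
    n = len(s)
    lo = s[int((1 - level) / 2 * n)]
    hi = s[min(n - 1, int((1 + level) / 2 * n))]
    return statistics.fmean(s), lo, hi


def rank_conditioned_estimate(x_obs, errors_by_rank, level=0.90):
    """Correct each observation using the error distribution at its rank.

    Since theta = x - error, the interval for theta is [x - hi, x - lo].
    Returns a list of (estimate, lower, upper) in the order of x_obs.
    """
    order = sorted(range(len(x_obs)), key=lambda i: x_obs[i])
    out = [None] * len(x_obs)
    for rank, i in enumerate(order):
        bias, lo, hi = summarize(errors_by_rank[rank], level)
        out[i] = (x_obs[i] - bias, x_obs[i] - hi, x_obs[i] - lo)
    return out
```

In this sketch the largest observations tend to have positive simulated error and the smallest negative error, so the correction pulls extreme values toward the center, which is the ranking bias the abstract refers to. The correction depends on the working prior only through the simulated error distribution at each rank, not through a posterior conditioned on the observed values themselves.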
Pages: 60-73
Page count: 14