Decentralized nonparametric multiple testing

Cited by: 2
Author
Mukhopadhyay, Subhadeep [1]
Affiliation
[1] Temple Univ, Dept Stat Sci, Philadelphia, PA 19122 USA
Keywords
Comparison density; decentralised large-scale inference; LP-Fourier transform; superposition principle; POWER;
DOI
10.1080/10485252.2018.1508678
Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract
Consider a big data multiple testing task where, due to storage and computational bottlenecks, one is given a very large collection of p-values that has been split into manageable chunks and distributed over thousands of computer nodes. This paper is concerned with the following question: How can we find the full data multiple testing solution by operating completely independently on individual machines in parallel, without any data exchange between nodes? This version of the problem arises naturally in a wide range of data-intensive science and industry applications, yet its methodological solution has not appeared in the literature to date; we therefore feel it is necessary to undertake such an analysis. Based on the nonparametric functional statistical viewpoint of large-scale inference, started in Mukhopadhyay, S. [(2016), 'Large Scale Signal Detection: A Unifying View', Biometrics, 72, 325-334], this paper furnishes a new computing model that brings unexpected simplicity to the design of the algorithm, which might otherwise seem daunting using the classical approach and notation.
Pages: 1003-1015
Number of pages: 13
Related papers
10 items in total
  • [1] CONTROLLING THE FALSE DISCOVERY RATE - A PRACTICAL AND POWERFUL APPROACH TO MULTIPLE TESTING
    BENJAMINI, Y
    HOCHBERG, Y
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 1995, 57 (01) : 289 - 300
  • [2] Higher criticism for detecting sparse heterogeneous mixtures
    Donoho, D
    Jin, JS
    [J]. ANNALS OF STATISTICS, 2004, 32 (03) : 962 - 994
  • [3] Empirical Bayes analysis of a microarray experiment
    Efron, B
    Tibshirani, R
    Storey, JD
    Tusher, V
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2001, 96 (456) : 1151 - 1160
  • [4] Size, power and false discovery rates
    Efron, Bradley
    [J]. ANNALS OF STATISTICS, 2007, 35 (04) : 1351 - 1377
  • [5] Ignatiadis N, 2016, NAT METHODS, V13, P577, DOI 10.1038/nmeth.3885
  • [6] The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog)
    MacArthur, Jacqueline
    Bowler, Emily
    Cerezo, Maria
    Gil, Laurent
    Hall, Peggy
    Hastings, Emma
    Junkins, Heather
    McMahon, Aoife
    Milano, Annalisa
    Morales, Joannella
    Pendlington, Zoe May
    Welter, Danielle
    Burdett, Tony
    Hindorff, Lucia
    Flicek, Paul
    Cunningham, Fiona
    Parkinson, Helen
    [J]. NUCLEIC ACIDS RESEARCH, 2017, 45 (D1) : D896 - D901
  • [7] Large-scale signal detection: A unified perspective
    Mukhopadhyay, Subhadeep
    [J]. BIOMETRICS, 2016, 72 (02) : 325 - 334
  • [8] Gene expression correlates of clinical prostate cancer behavior
    Singh, D
    Febbo, PG
    Ross, K
    Jackson, DG
    Manola, J
    Ladd, C
    Tamayo, P
    Renshaw, AA
    D'Amico, AV
    Richie, JP
    Lander, ES
    Loda, M
    Kantoff, PW
    Golub, TR
    Sellers, WR
    [J]. CANCER CELL, 2002, 1 (02) : 203 - 209
  • [9] Westfall P.H., 2004, IMS LECT NOTES MONOG, V47, P143
  • [10] seeQTL: a searchable database for human eQTLs
    Xia, Kai
    Shabalin, Andrey A.
    Huang, Shunping
    Madar, Vered
    Zhou, Yi-Hui
    Wang, Wei
    Zou, Fei
    Sun, Wei
    Sullivan, Patrick F.
    Wright, Fred A.
    [J]. BIOINFORMATICS, 2012, 28 (03) : 451 - 452