Prediction, Estimation, and Attribution

Cited by: 18
Authors
Efron, Bradley [1]
Affiliations
[1] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
Funding
National Science Foundation (USA)
Keywords
Black box; Ephemeral predictors; Random forests; Surface plus noise; SELECTION;
DOI
10.1111/insr.12409
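Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification codes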
020208 ; 070103 ; 0714 ;
Abstract
The scientific needs and computational limitations of the twentieth century fashioned classical statistical methodology. Both the needs and limitations have changed in the twenty-first, and so has the methodology. Large-scale prediction algorithms (neural nets, deep learning, boosting, support vector machines, random forests) have achieved star status in the popular press. They are recognizable as heirs to the regression tradition, but ones carried out at enormous scale and on titanic datasets. How do these algorithms compare with standard regression techniques such as ordinary least squares or logistic regression? Several key discrepancies will be examined, centering on the differences between prediction and estimation or prediction and attribution (significance testing). Most of the discussion is carried out through small numerical examples.
Pages: S28-S59
Page count: 32