Estimation of a non-parametric variable importance measure of a continuous exposure

Cited by: 12
Authors
Chambaz, Antoine [1 ,2 ]
Neuvial, Pierre [3 ]
van der Laan, Mark J. [4 ,5 ]
Affiliations
[1] Univ Paris 05, MAP5, Paris, France
[2] CNRS, F-75700 Paris, France
[3] Univ Evry Val d'Essonne, Lab Stat & Genome, UMR CNRS USC INRA 8071, Evry, France
[4] Univ Calif Berkeley, Div Biostat, Berkeley, CA 94720 USA
[5] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
Source
ELECTRONIC JOURNAL OF STATISTICS | 2012 / Volume 6
Keywords
Variable importance measure; non-parametric estimation; targeted minimum loss estimation; robustness; asymptotics; COPY NUMBER; EXPRESSION; GENOME
DOI
10.1214/12-EJS703
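Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification code
020208; 070103; 0714
Abstract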
We define a new measure of variable importance of an exposure on a continuous outcome, accounting for potential confounders. The exposure features a reference level x(0) with positive mass and a continuum of other levels. To estimate this measure, we fully develop the semi-parametric estimation methodology called targeted minimum loss estimation (TMLE) [23, 22]. We cover the whole spectrum of its theoretical study (convergence of the iterative procedure at the core of the TMLE methodology; consistency and asymptotic normality of the estimator), practical implementation, simulation study, and application to the genomic example that originally motivated this article. In that example, the exposure X and response Y are, respectively, the DNA copy number and expression level of a given gene in a cancer cell. Here, the reference level is x(0) = 2, that is, the expected DNA copy number in a normal cell. The confounder is a measure of the methylation of the gene. The fact that there is no clear biological indication that X and Y can be interpreted as an exposure and a response, respectively, is not problematic.
Pages: 1059-1099
Page count: 41
Related papers
26 in total
  • [1] Multi-Platform Whole-Genome Microarray Analyses Refine the Epigenetic Signature of Breast Cancer Metastasis with Gene Expression and Copy Number
    Andrews, Joseph
    Kennette, Wendy
    Pilon, Jenna
    Hodgson, Alexandra
    Tuck, Alan B.
    Chambers, Ann F.
    Rodenhiser, David I.
    [J]. PLOS ONE, 2010, 5 (01):
  • [2] [Anonymous], 2010, R LANG ENV STAT COMP
  • [3] Integrated genomic analyses of ovarian carcinoma
    Bell, D.
    Berchuck, A.
    Birrer, M.
    Chien, J.
    Cramer, D. W.
    Dao, F.
    Dhir, R.
    DiSaia, P.
    Gabra, H.
    Glenn, P.
    Godwin, A. K.
    Gross, J.
    Hartmann, L.
    Huang, M.
    Huntsman, D. G.
    Iacocca, M.
    Imielinski, M.
    Kalloger, S.
    Karlan, B. Y.
    Levine, D. A.
    Mills, G. B.
    Morrison, C.
    Mutch, D.
    Olvera, N.
    Orsulic, S.
    Park, K.
    Petrelli, N.
    Rabeno, B.
    Rader, J. S.
    Sikic, B. I.
    Smith-McCune, K.
    Sood, A. K.
    Bowtell, D.
    Penny, R.
    Testa, J. R.
    Chang, K.
    Dinh, H. H.
    Drummond, J. A.
    Fowler, G.
    Gunaratne, P.
    Hawes, A. C.
    Kovar, C. L.
    Lewis, L. R.
    Morgan, M. B.
    Newsham, I. F.
    Santibanez, J.
    Reid, J. G.
    Trevino, L. R.
    Wu, Y. -Q.
    Wang, M.
    [J]. NATURE, 2011, 474 (7353) : 609 - 615
  • [4] Bembom O, 2007, STAT APPL GENET MOL, V6
  • [5] Random forests
    Breiman, L
    [J]. MACHINE LEARNING, 2001, 45 (01) : 5 - 32
  • [6] Comprehensive genomic characterization defines human glioblastoma genes and core pathways
    Chin, L.
    Meyerson, M.
    Aldape, K.
    Bigner, D.
    Mikkelsen, T.
    VandenBerg, S.
    Kahn, A.
    Penny, R.
    Ferguson, M. L.
    Gerhard, D. S.
    Getz, G.
    Brennan, C.
    Taylor, B. S.
    Winckler, W.
    Park, P.
    Ladanyi, M.
    Hoadley, K. A.
    Verhaak, R. G. W.
    Hayes, D. N.
    Spellman, Paul T.
    Absher, D.
    Weir, B. A.
    Ding, L.
    Wheeler, D.
    Lawrence, M. S.
    Cibulskis, K.
    Mardis, E.
    Zhang, Jinghui
    Wilson, R. K.
    Donehower, L.
    Wheeler, D. A.
    Purdom, E.
    Wallis, J.
    Laird, P. W.
    Herman, J. G.
    Schuebel, K. E.
    Weisenberger, D. J.
    Baylin, S. B.
    Schultz, N.
    Yao, Jun
    Wiedemeyer, R.
    Weinstein, J.
    Sander, C.
    Gibbs, R. A.
    Gray, J.
    Kucherlapati, R.
    Lander, E. S.
    Myers, R. M.
    Perou, C. M.
    McLendon, Roger
    [J]. NATURE, 2008, 455 (7216) : 1061 - 1068
  • [7] Mapping the cancer genome - Pinpointing the genes involved in cancer will help chart a new course across the complex landscape of human malignancies
    Collins, Francis S.
    Barker, Anna D.
    [J]. SCIENTIFIC AMERICAN, 2007, 296 (03) : 50 - 57
  • [8] Dimitriadou E., 2011, Misc functions of the Department of Statistics (e1071)
  • [9] Hanson R., 1995, SOLVING LEAST SQUARE, V15
  • [10] HASTIE T., 2011, GEN ADDITIVE MODELS