PHOTO-z ESTIMATION: AN EXAMPLE OF NONPARAMETRIC CONDITIONAL DENSITY ESTIMATION UNDER SELECTION BIAS

Cited by: 18
Authors
Izbicki, Rafael [1]
Lee, Ann B. [2]
Freeman, Peter E. [2]
Affiliations
[1] Univ Fed Sao Carlos, Dept Estat, Rodovia Washington Luis,Km 235 SP 310, Sao Paulo, Brazil
[2] Carnegie Mellon Univ, Dept Stat, Baker Hall, Pittsburgh, PA 15213 USA
Funding
National Science Foundation (USA); São Paulo Research Foundation (Brazil)
Keywords
Density estimation; nonparametric statistics; selection bias; photometric redshift; redshift distribution; covariate shift; inference
DOI
10.1214/16-AOAS1013
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Redshift is a key quantity for inferring cosmological model parameters. In photometric redshift estimation, cosmologists use the coarse data collected from the vast majority of galaxies to predict the redshift of individual galaxies. To properly quantify the uncertainty in the predictions, however, one needs to go beyond standard regression and instead estimate the full conditional density f(z|x) of a galaxy's redshift z given its photometric covariates x. The problem is further complicated by selection bias: usually only the rarest and brightest galaxies have known redshifts, and these galaxies have characteristics and measured covariates that do not necessarily match those of the more numerous and dimmer galaxies of unknown redshift. Unfortunately, there is little research on how best to estimate complex multivariate densities in such settings. Here we describe a general framework for properly constructing and assessing nonparametric conditional density estimators under selection bias, and for combining two or more estimators for optimal performance. We propose new, improved photo-z estimators and illustrate our methods on data from the Sloan Digital Sky Survey and an application to galaxy-galaxy lensing. Although our main application is photo-z estimation, our methods are relevant to any high-dimensional regression setting with complicated asymmetric and multimodal distributions in the response variable.
Pages: 698-724
Page count: 27
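
Illustrative sketch (not part of the record): the abstract describes estimating the conditional density f(z|x) and assessing estimators when the labeled galaxies are not representative of the unlabeled ones. The Python below is a minimal sketch of that idea under stated assumptions, not the paper's estimators. It assumes a single covariate per galaxy, a simple Gaussian-kernel estimator, hand-picked bandwidths, and a known importance weight beta(x) = f_U(x) / f_L(x) (the ratio of unlabeled to labeled covariate densities); it also assumes f(z|x) is the same in both populations. All function names and the toy data are hypothetical.

```python
import math


def gaussian_kernel(u, h):
    """Gaussian density with standard deviation h, evaluated at u."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))


def kernel_cde(z, x, xs, zs, hx=0.3, hz=0.1):
    """Kernel estimate of f(z|x) from labeled pairs (xs[i], zs[i]).

    Each labeled redshift contributes a Gaussian bump in z, weighted by
    how close its covariate is to x.
    """
    weights = [gaussian_kernel(x - xi, hx) for xi in xs]
    total = sum(weights)
    if total == 0.0:  # x is far from every labeled covariate
        return 0.0
    return sum(w * gaussian_kernel(z - zi, hz) for w, zi in zip(weights, zs)) / total


def weighted_cde_loss(cde, x_lab, z_lab, x_unlab, beta, z_grid):
    """Importance-weighted estimate of a squared-error CDE loss, up to a constant.

    cde(z, x) returns the estimated density; beta(x) is the assumed
    importance weight; z_grid is an equally spaced grid for a Riemann sum.
    """
    dz = z_grid[1] - z_grid[0]
    # Term 1: average over unlabeled x of the integral of cde(z|x)^2 dz.
    term1 = sum(
        sum(cde(z, x) ** 2 for z in z_grid) * dz for x in x_unlab
    ) / len(x_unlab)
    # Term 2: labeled validation points, reweighted toward the unlabeled sample.
    term2 = sum(cde(z, x) * beta(x) for x, z in zip(x_lab, z_lab)) / len(x_lab)
    return term1 - 2.0 * term2


# Hypothetical toy data, for illustration only.
x_train, z_train = [0.0, 0.5, 1.0], [0.1, 0.3, 0.5]
x_val, z_val = [0.25, 0.75], [0.2, 0.4]
x_unlabeled = [0.25, 0.75]
z_grid = [-1.0 + 0.001 * i for i in range(3001)]


def sharp(z, x):
    """Estimator with a narrow redshift bandwidth."""
    return kernel_cde(z, x, x_train, z_train, hz=0.1)


def flat(z, x):
    """Estimator with a very wide redshift bandwidth (nearly uninformative)."""
    return kernel_cde(z, x, x_train, z_train, hz=5.0)


def no_shift(x):
    """Placeholder weight: assumes labeled and unlabeled samples look alike."""
    return 1.0


loss_sharp = weighted_cde_loss(sharp, x_val, z_val, x_unlabeled, no_shift, z_grid)
loss_flat = weighted_cde_loss(flat, x_val, z_val, x_unlabeled, no_shift, z_grid)
print("sharp estimator preferred:", loss_sharp < loss_flat)
```

A lower loss indicates a better fit on the unlabeled population, so the same function can be used to compare bandwidths or competing estimators. In practice beta(x) would have to be estimated from the two samples rather than fixed at 1, which this sketch does not attempt.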