Machine learning classification of SDSS transient survey images

Cited by: 37
Authors
du Buisson, L. [1, 2]
Sivanandam, N. [2]
Bassett, Bruce A. [1, 2, 3]
Smith, M. [4, 5]
Affiliations
[1] Univ Cape Town, Dept Math & Appl Math, ZA-7700 Rondebosch, South Africa
[2] African Inst Math Sci, ZA-7945 Muizenberg, South Africa
[3] S African Astron Observ, ZA-7925 Observatory, South Africa
[4] Univ Western Cape, Dept Phys, ZA-7535 Cape Town, South Africa
[5] Univ Southampton, Sch Phys & Astron, Southampton SO17 1BJ, Hants, England
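Funding
US National Science Foundation; National Research Foundation of Singapore; US National Aeronautics and Space Administration (NASA)
Keywords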
methods: data analysis; methods: observational; methods: statistical; techniques: image processing; techniques: photometric; surveys; II SUPERNOVA SURVEY; COSMOLOGY; DISCOVERY; AGREEMENT;
DOI
10.1093/mnras/stv2041
Chinese Library Classification number
P1 [Astronomy]
Discipline classification code
0704
Abstract
We show that multiple machine learning algorithms can match human performance in classifying transient imaging data from the Sloan Digital Sky Survey (SDSS) supernova survey into real objects and artefacts. This is a first step in any transient science pipeline and is currently still done by humans, but future surveys such as the Large Synoptic Survey Telescope (LSST) will necessitate fully machine-enabled solutions. Using features trained from eigenimage analysis (principal component analysis, PCA) of single-epoch g, r and i difference images, we can reach a completeness (recall) of 96 per cent, while only incorrectly classifying at most 18 per cent of artefacts as real objects, corresponding to a precision (purity) of 84 per cent. In general, random forests performed best, followed by the k-nearest neighbour and the SkyNet artificial neural net algorithms, compared to other methods such as naive Bayes and kernel support vector machine. Our results show that PCA-based machine learning can match human success levels and can naturally be extended by including multiple epochs of data, transient colours and host galaxy information which should allow for significant further improvements, especially at low signal-to-noise.
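The quoted recall, artefact misclassification rate and precision are linked by simple arithmetic once the class balance is fixed. The following is a minimal sketch, not the authors' code: the function name and the object counts are hypothetical, and the balanced case assumes equal numbers of real objects and artefacts in the test set, which the abstract does not state. Under that assumption, a recall of 0.96 and a false-positive rate of 0.18 give a precision close to the quoted 84 per cent.

```python
def precision_from_rates(recall, false_positive_rate, n_real, n_artefact):
    """Precision (purity) implied by recall, false-positive rate and class counts.

    precision = TP / (TP + FP), with
      TP = recall * n_real                    (real objects correctly kept)
      FP = false_positive_rate * n_artefact   (artefacts wrongly kept)
    """
    true_pos = recall * n_real
    false_pos = false_positive_rate * n_artefact
    return true_pos / (true_pos + false_pos)


# Assumption: a balanced test set (hypothetical counts, not taken from the paper).
balanced = precision_from_rates(0.96, 0.18, n_real=1000, n_artefact=1000)
print(f"balanced classes: precision = {balanced:.3f}")   # 0.842

# Same rates with ten artefacts per real object (also hypothetical):
# precision depends strongly on class balance, unlike recall.
skewed = precision_from_rates(0.96, 0.18, n_real=1000, n_artefact=10000)
print(f"1:10 real:artefact: precision = {skewed:.3f}")   # 0.348
```

The second call illustrates why precision figures should be read together with the class balance of the sample they were measured on.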
Pages: 2026-2038
Page count: 13