SOM ensemble for unsupervised outlier analysis. Application to outlier identification in the Gaia astronomical survey

Cited by: 25
Authors
Fustes, Diego [1]
Dafonte, Carlos [1]
Arcay, Bernardino [1]
Manteiga, Minia [1]
Smith, Kester [2]
Vallenari, Antonella [3]
Luri, Xavier [4]
Affiliations
[1] Univ A Coruna, Fac Informat, La Coruna 15071, Spain
[2] Max Planck Inst Astron, D-69117 Heidelberg, Germany
[3] Osserv Astron Padova, INAF, Padua, Italy
[4] Dept Astron & Meteorol ICCUB IEEC, Barcelona, Spain
Keywords
Ensemble method; Self-Organizing Map; Classification outlier; Knowledge discovery in astronomy; Unsupervised classification; Gaia mission; Spectrophotometry; FFT; Wavelet transform; CLASSIFICATION; NOISE; TOOL
DOI
10.1016/j.eswa.2012.08.069
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification code
081104; 0812; 0835; 1405
Abstract
Gaia is an ESA cornerstone astronomical mission that will observe, with unprecedented precision, the positions, distances, space motions, and many physical properties of more than one billion objects in our Galaxy and beyond. It will observe all objects in the sky in the visible magnitude range from 6 to 20, up to approximately 10^9 sources. An international scientific consortium, the Gaia Data Processing and Analysis Consortium (Gaia DPAC), has organized itself into several coordination units, with the aim, among others, of classifying the observed astronomical sources using both supervised and unsupervised classification algorithms. This work focuses on the analysis of classification outliers by means of unsupervised classification. We present a novel method to combine SOMs trained with independent features that are calculated from spectrophotometry. The method as described here can help to improve the models used for the supervised classification of astronomical sources. Furthermore, it allows for data exploration and knowledge discovery in huge astronomical databases such as that of the upcoming Gaia mission. (C) 2012 Elsevier Ltd. All rights reserved.
Pages: 1530-1541
Page count: 12