Refining Exoplanet Detection Using Supervised Learning and Feature Engineering

Cited by: 4
Authors
Bugueno, Margarita [1]
Mena, Francisco [1]
Araya, Mauricio [2]
Affiliations
[1] Univ Tecn Federico Santa Maria, Dept Informat, Santiago, Chile
[2] Univ Tecn Federico Santa Maria, Dept Informat, Valparaiso, Chile
Source
2018 XLIV Latin American Computer Conference (CLEI 2018) | 2018
Keywords
Machine Learning; Exoplanet Detection; Feature Engineering; Frequency Analysis; Time; Dimensionality; PCA
DOI
10.1109/CLEI.2018.00041
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The field of astronomical data analysis has undergone an important paradigm shift in recent years. Automating certain analysis procedures is no longer merely a desirable way to reduce human effort, but a must-have asset for coping with the extremely large datasets that new instrumentation technologies are producing. In particular, the detection of transiting planets (bodies that move across the face of another body) is an ideal setting for intelligent automation. Determining whether the variation within a light curve is evidence of a planet requires applying advanced pattern recognition methods to a very large number of candidate stars. Here we present a supervised learning approach to refine the results produced by a case-by-case analysis of light curves, harnessing the generalization power of machine learning techniques to predict the labels of currently unclassified light curves. The method uses feature engineering to find a suitable representation for classification, and several performance criteria to evaluate the alternatives and select among them. Our results show that this automatic technique can help speed up the very time-consuming manual process currently carried out by scientific experts.
Pages: 278-287
Page count: 10