Refining Exoplanet Detection Using Supervised Learning and Feature Engineering

Cited by: 3
Authors
Bugueno, Margarita [1]
Mena, Francisco [1]
Araya, Mauricio [2]
Affiliations
[1] Univ Tecn Federico Santa Maria, Dept Informat, Santiago, Chile
[2] Univ Tecn Federico Santa Maria, Dept Informat, Valparaiso, Chile
Source
2018 XLIV Latin American Computer Conference (CLEI 2018) | 2018
Keywords
Machine Learning; Exoplanet Detection; Feature Engineering; Frequency Analysis; Time; Dimensionality; PCA
DOI
10.1109/CLEI.2018.00041
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The field of astronomical data analysis has experienced an important paradigm shift in recent years. Automating certain analysis procedures is no longer merely a desirable way to reduce human effort, but a necessity for coping with the extremely large datasets that new instrumentation technologies are producing. In particular, the detection of transiting planets (bodies that move across the face of another body) is an ideal setting for intelligent automation. Knowing whether the variation within a light curve is evidence of a planet requires applying advanced pattern recognition methods to a very large number of candidate stars. Here we present a supervised learning approach to refine the results produced by a case-by-case analysis of light curves, harnessing the generalization power of machine learning techniques to predict the currently unclassified light curves. The method uses feature engineering to find a suitable representation for classification, and several performance criteria to evaluate the candidate representations and choose among them. Our results show that this automatic technique can help speed up the very time-consuming manual process currently carried out by scientific experts.
Pages: 278-287
Page count: 10