Genetic Algorithm for the Mutual Information-Based Feature Selection in Univariate Time Series Data

Times cited: 9
Authors
Siddiqi, Umair F. [1]
Sait, Sadiq M. [1, 2]
Kaynak, Okyay [3 ]
Affiliations
[1] King Fahd Univ Petr & Minerals, Ctr Commun & IT Res, Res Inst, Dhahran 31261, Saudi Arabia
[2] King Fahd Univ Petr & Minerals, Dept Comp Engn, Dhahran 31261, Saudi Arabia
[3] Bogazici Univ, Dept Elect & Elect Engn, TR-80815 Istanbul, Turkey
Source
IEEE ACCESS | 2020 / Vol. 8 / Issue 08
Keywords
Feature selection; genetic algorithm; machine learning; deep learning; optimization methods; forecasting;
DOI
10.1109/ACCESS.2020.2964803
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Filters are the fastest among the different types of feature selection methods. They employ metrics from information theory, such as mutual information (MI), joint MI (JMI), and minimal redundancy and maximal relevance (mRMR). Determining the optimal feature set is an NP-hard problem. This work proposes a Genetic Algorithm (GA) in which the fitness of a solution consists of two terms. The first is a feature selection metric such as MI, JMI, or mRMR, and the second is the overlapping coefficient, which accounts for diversity in the GA population. Experimental results show that the proposed algorithm can return multiple good-quality solutions that also have minimal overlap with each other. Having multiple solutions provides significant benefits when the test data contains none or missing values. Experiments were conducted using two publicly available time-series datasets. The feature sets were also used for forecasting with a simple Long Short-Term Memory (LSTM) model, and the forecasting quality obtained with the different feature sets was analyzed. The proposed algorithm was compared with a popular optimization tool, 'Basic Open-source Nonlinear Mixed INteger programming' (BONMIN), and a recent feature selection algorithm, 'Conditional Mutual Information Considering Feature Interaction' (CMFSI). The experiments show that the multiple solutions found by the proposed method have good quality and minimal overlap.
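To make the two-term fitness in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the `weight` parameter, the use of the Szymkiewicz-Simpson overlap coefficient, and the choice to penalize the largest overlap with other population members are all assumptions made for illustration. The per-feature relevance scores (for example MI with the target) are assumed to be precomputed.

```python
from typing import Dict, FrozenSet, List


def overlap_coefficient(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Szymkiewicz-Simpson overlap: |A & B| / min(|A|, |B|)."""
    if not a or not b:
        return 0.0
    return len(a & b) / min(len(a), len(b))


def fitness(
    candidate: FrozenSet[int],
    others: List[FrozenSet[int]],
    relevance: Dict[int, float],
    weight: float = 0.5,  # hypothetical trade-off parameter
) -> float:
    """Two-term fitness: feature-quality term minus a diversity penalty.

    candidate: indices of the selected features (for example lag indices).
    others:    the remaining feature sets in the GA population.
    relevance: precomputed score per feature, such as MI with the target.
    """
    if not candidate:
        raise ValueError("candidate must select at least one feature")
    # First term: mean relevance of the selected features.
    quality = sum(relevance[f] for f in candidate) / len(candidate)
    # Second term: largest overlap with any other population member.
    penalty = max(
        (overlap_coefficient(candidate, other) for other in others),
        default=0.0,
    )
    return quality - weight * penalty


if __name__ == "__main__":
    scores = {0: 0.9, 1: 0.5, 2: 0.1, 3: 0.4}
    population = [frozenset({1, 2}), frozenset({2, 3})]
    print(fitness(frozenset({0, 1}), population, scores))
```

Under this sketch, a candidate that shares many features with an existing population member scores lower than an equally relevant candidate that shares none, which is the kind of pressure toward low-overlap solutions the abstract describes.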
Pages: 9597 - 9609
Page count: 13
Related papers
29 in total
  • [1] Using the mutual information technique to select explanatory variables in artificial neural networks for rainfall forecasting
    Babel, Mukand S.
    Badgujar, Girish B.
    Shinde, Victor R.
    [J]. METEOROLOGICAL APPLICATIONS, 2015, 22 (03) : 610 - 616
  • [2] Mutual information and sensitivity analysis for feature selection in customer targeting: A comparative study
    Barraza, Nestor
    Moro, Sergio
    Ferreyra, Marcelo
    de la Pena, Adolfo
    [J]. JOURNAL OF INFORMATION SCIENCE, 2019, 45 (01) : 53 - 67
  • [3] USING MUTUAL INFORMATION FOR SELECTING FEATURES IN SUPERVISED NEURAL-NET LEARNING
    BATTITI, R
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS, 1994, 5 (04) : 537 - 550
  • [4] Feature selection using Joint Mutual Information Maximisation
    Bennasar, Mohamed
    Hicks, Yulia
    Setchi, Rossitza
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2015, 42 (22) : 8520 - 8532
  • [5] TRAINING A 3-NODE NEURAL NETWORK IS NP-COMPLETE
    BLUM, AL
    RIVEST, RL
    [J]. NEURAL NETWORKS, 1992, 5 (01) : 117 - 127
  • [6] Benchmark for filter methods for feature selection in high-dimensional classification data
    Bommert, Andrea
    Sun, Xudong
    Bischl, Bernd
    Rahnenfuehrer, Joerg
    Lang, Michel
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2020, 143
  • [7] An algorithmic framework for convex mixed integer nonlinear programs
    Bonami, Pierre
    Biegler, Lorenz T.
    Conn, Andrew R.
    Cornuejols, Gerard
    Grossmann, Ignacio E.
    Laird, Carl D.
    Lee, Jon
    Lodi, Andrea
    Margot, Francois
    Sawaya, Nicolas
    Wachter, Andreas
    [J]. DISCRETE OPTIMIZATION, 2008, 5 (02) : 186 - 204
  • [8] Borgonovo E., 2017, SENSITIVITY ANAL, V251
  • [9] A survey on feature selection methods
    Chandrashekar, Girish
    Sahin, Ferat
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2014, 40 (01) : 16 - 28
  • [10] Feature selection for time series prediction - A combined filter and wrapper approach for neural networks
    Crone, Sven F.
    Kourentzes, Nikolaos
    [J]. NEUROCOMPUTING, 2010, 73 (10-12) : 1923 - 1936