A modified genetic algorithm and weighted principal component analysis based feature selection and extraction strategy in agriculture

Cited by: 18
Authors
Shastry, K. Aditya [1]
Sanjay, H. A. [1]
Affiliations
[1] Nitte Meenakshi Inst Technol, Bengaluru 64, India
Keywords
Feature selection; Feature extraction; Hybrid; Genetic Algorithm; Weighted-Principal Component Analysis
DOI
10.1016/j.knosys.2021.107460
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data pre-processing transforms raw data into a format suitable for machine learning (ML) techniques. Feature selection (FS) and feature extraction (FeExt) are significant components of data pre-processing. FS identifies the relevant features that enhance the accuracy of a model. Since agricultural data contain diverse features related to climate, soil, and fertilizer, FS is particularly important, as irrelevant features may adversely affect the predictions of the resulting model. Likewise, FeExt derives new attributes from the existing attributes. All the information that the original attributes possess is present in these new features, minus the redundancy. With these points in mind, this work proposes a hybrid feature selection and feature extraction strategy for agricultural data sets. A modified-Genetic Algorithm (m-GA) was developed by designing a fitness function based on "Mutual Information" (MutInf) and "Root Mean Square Error" (RtMSE) to choose the features that most affect the target attribute (crop yield in this case). The selected features were then subjected to feature extraction using "weighted principal component analysis" (wgt-PCA). The extracted features were fed into different ML models, viz. "Regression" (Reg), "Artificial Neural Networks" (ArtNN), "Adaptive Neuro Fuzzy Inference System" (ANFIS), "Ensemble of Trees" (EnT), and "Support Vector Regression" (SuVR). Trials on 3 benchmark and 8 real-world farming datasets revealed that the developed hybrid feature selection and extraction technique achieved significant improvements in Rsq2, RtMSE, and "mean absolute error" (MAE) compared with FS and FeExt methods such as Correlation Analysis (CA), Singular Value Decomposition (SiVD), Genetic Algorithm (GA), and wgt-PCA. © 2021 Published by Elsevier B.V.
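The abstract names the ingredients of the pipeline (a GA whose fitness combines mutual information and RMSE, followed by a weighted PCA) but not the exact formulas. The following is a minimal sketch of how such a pipeline could be wired together, not the authors' implementation. Everything specific here is an assumption made for illustration: the histogram-based mutual information estimate, the linear-regression RMSE, the trade-off weight `alpha`, the GA operators and their rates, and the interpretation of "weighted PCA" as PCA on feature-weighted standardized columns. All function names are hypothetical.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of mutual information (nats) between two 1-D arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def rmse_of_linear_fit(X, y):
    """In-sample RMSE of an ordinary least-squares fit with intercept."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sqrt(np.mean((A @ coef - y) ** 2)))

def fitness(mask, X, y, alpha=0.5):
    """Assumed fitness: reward mean MI with the target, penalise RMSE."""
    if not mask.any():
        return -np.inf                    # an empty feature set is never selected
    Xs = X[:, mask]
    mi = np.mean([mutual_information(Xs[:, j], y) for j in range(Xs.shape[1])])
    return alpha * mi - (1 - alpha) * rmse_of_linear_fit(Xs, y)

def select_features(X, y, pop_size=20, generations=30, mutation_rate=0.1, seed=0):
    """Plain GA over boolean feature masks (assumes at least 2 features)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    pop = rng.random((pop_size, n)) < 0.5
    for _ in range(generations):
        scores = np.array([fitness(ind, X, y) for ind in pop])
        parents = pop[np.argsort(scores)[::-1][: pop_size // 2]]  # keep best half
        children = []
        while len(children) < pop_size - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)                     # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n) < mutation_rate       # bit-flip mutation
            children.append(child)
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(ind, X, y) for ind in pop])
    return pop[np.argmax(scores)]

def weighted_pca(X, weights, n_components):
    """PCA on standardized columns scaled by sqrt of per-feature weights.

    Assumes no column of X is constant (non-zero standard deviation).
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    Z = Z * np.sqrt(weights)
    cov = np.atleast_2d(np.cov(Z, rowvar=False))
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]     # largest variance first
    return Z @ eigvecs[:, order]
```

One way to chain the two stages under these assumptions is to call `select_features(X, y)` to obtain a boolean mask, compute a weight per selected column (for instance its mutual information with the target), and pass `X[:, mask]` with those weights to `weighted_pca`. The resulting components would then serve as inputs to whichever regression model is being evaluated.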
Pages: 18