A modified genetic algorithm and weighted principal component analysis based feature selection and extraction strategy in agriculture

Cited by: 18
Authors
Shastry, K. Aditya [1]
Sanjay, H. A. [1]
Affiliations
[1] Nitte Meenakshi Inst Technol, Bengaluru 64, India
Keywords
Feature selection; Feature extraction; Hybrid; Genetic Algorithm; Weighted-Principal Component Analysis
DOI
10.1016/j.knosys.2021.107460
Chinese Library Classification
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Data pre-processing transforms raw data into a format suitable for machine learning (ML) techniques. Feature selection (FS) and feature extraction (FeExt) are significant components of data pre-processing. FS is the identification of relevant features that enhance the accuracy of a model. Since agricultural data contain diverse features related to climate, soil, and fertilizer, FS is especially important, as irrelevant features may adversely affect the predictions of the resulting model. Likewise, FeExt derives new attributes from the existing attributes. All the information that the original attributes possess is present in these new features, minus the redundancy. With these points in mind, this work proposes a hybrid feature selection and feature extraction strategy for selecting features from agricultural data sets. A modified Genetic Algorithm (m-GA) was developed by designing a fitness function based on "Mutual Information" (MutInf) and "Root Mean Square Error" (RtMSE) to choose the features that most affect the target attribute (crop yield in this case). These selected features were then subjected to feature extraction using "weighted principal component analysis" (wgt-PCA). The extracted features were then fed into different ML models, viz. "Regression" (Reg), "Artificial Neural Networks" (ArtNN), "Adaptive Neuro Fuzzy Inference System" (ANFIS), "Ensemble of Trees" (EnT), and "Support Vector Regression" (SuVR). Trials on 3 benchmark and 8 real-world farming datasets revealed that the developed hybrid feature selection and extraction technique achieved significant improvements in R², RtMSE, and "mean absolute error" (MAE) compared with FS and FeExt methods such as Correlation Analysis (CA), Singular Value Decomposition (SiVD), Genetic Algorithm (GA), and wgt-PCA. © 2021 Published by Elsevier B.V.
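The two-stage idea in the abstract (GA-based selection scored by mutual information and RMSE, followed by weighted PCA) can be illustrated with a minimal NumPy sketch. This is not the authors' m-GA or wgt-PCA: the abstract does not give the fitness formula, the genetic operators, or the PCA weighting scheme, so everything below is an assumption made for illustration. In particular, the `alpha` trade-off, the histogram MI estimate, the in-sample linear-regression RMSE, the single-point crossover, and the use of MI scores as PCA weights are all hypothetical choices.

```python
import numpy as np

def mutual_info(x, y, bins=8):
    """Histogram estimate of mutual information (nats) between two 1-D arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

def rmse_linear(X, y):
    """In-sample RMSE of an ordinary least-squares fit with intercept."""
    A = np.column_stack([X, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sqrt(np.mean((A @ coef - y) ** 2)))

def fitness(mask, X, y, mi, alpha=0.5):
    """Assumed fitness: reward mean MI of chosen features, penalise scaled RMSE."""
    if not mask.any():
        return -np.inf                    # an empty feature set is never valid
    rmse = rmse_linear(X[:, mask], y) / (np.std(y) + 1e-12)
    return alpha * mi[mask].mean() - (1 - alpha) * rmse

def select_features(X, y, pop=20, gens=30, p_mut=0.1, seed=0):
    """Plain elitist GA over boolean feature masks (needs at least 2 features)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    mi = np.array([mutual_info(X[:, j], y) for j in range(n)])
    population = rng.random((pop, n)) < 0.5
    for _ in range(gens):
        scores = np.array([fitness(m, X, y, mi) for m in population])
        parents = population[np.argsort(scores)[::-1][: pop // 2]]
        children = []
        while len(children) < pop - len(parents):
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n)                  # single-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n) < p_mut            # bit-flip mutation
            children.append(child)
        population = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(m, X, y, mi) for m in population])
    return population[np.argmax(scores)], mi

def weighted_pca(X, weights, n_components=2):
    """PCA on standardised columns scaled by sqrt of normalised weights (assumed scheme)."""
    Z = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    Z = Z * np.sqrt(weights / (weights.sum() + 1e-12))
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ vt[:n_components].T        # component scores, largest variance first

# Synthetic demo data (not from the paper): y depends on features 0 and 2 only.
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(200, 6))
y_demo = 3 * X_demo[:, 0] - 2 * X_demo[:, 2] + 0.1 * demo_rng.normal(size=200)
mask, mi = select_features(X_demo, y_demo)
k = min(2, int(mask.sum()))
Z_demo = weighted_pca(X_demo[:, mask], mi[mask], n_components=k)
print("selected mask:", mask, "extracted shape:", Z_demo.shape)
```

The extracted columns of `Z_demo` would then be passed to a downstream regressor, which is the role the ML models listed in the abstract play in the paper. A held-out or cross-validated RMSE would be a more defensible fitness term than the in-sample value used here, which is kept only for brevity.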
Pages: 18
Related papers
50 in total
  • [31] Supervised feature selection using principal component analysis
    Rahmat, Fariq
    Zulkafli, Zed
    Ishak, Asnor Juraiza
    Rahman, Ribhan Zafira Abdul
    De Stercke, Simon
    Buytaert, Wouter
    Tahir, Wardah
    Ab Rahman, Jamalludin
    Ibrahim, Salwa
    Ismail, Muhamad
    KNOWLEDGE AND INFORMATION SYSTEMS, 2024, 66 (03) : 1955 - 1995
  • [32] Supervised feature selection using principal component analysis
    Fariq Rahmat
    Zed Zulkafli
    Asnor Juraiza Ishak
    Ribhan Zafira Abdul Rahman
    Simon De Stercke
    Wouter Buytaert
    Wardah Tahir
    Jamalludin Ab Rahman
    Salwa Ibrahim
    Muhamad Ismail
    Knowledge and Information Systems, 2024, 66 : 1955 - 1995
  • [33] A Clustering Based Genetic Algorithm for Feature Selection
    Rostami, Mehrdad
    Moradi, Parham
    2014 6TH CONFERENCE ON INFORMATION AND KNOWLEDGE TECHNOLOGY (IKT), 2014, : 112 - 116
  • [34] Feature subset selection based on the genetic algorithm
    Yang, Jingwei
    Wang, Sile
    Chen, Yingyi
    Lu, Sukui
    Yang, Wenzhu
    ADVANCED TECHNOLOGIES IN MANUFACTURING, ENGINEERING AND MATERIALS, PTS 1-3, 2013, 774-776 : 1532 - +
  • [35] Deluge based Genetic Algorithm for feature selection
    Ritam Guha
    Manosij Ghosh
    Souvik Kapri
    Sushant Shaw
    Shyok Mutsuddi
    Vikrant Bhateja
    Ram Sarkar
    Evolutionary Intelligence, 2021, 14 : 357 - 367
  • [36] Application of Modified Genetic Algorithm in Feature extraction of the Unstructured Data
    Du, Nan
    Peng, Hong
    Zhang, Wenfeng
    INTERNATIONAL CONFERENCE ON ADVANCED COMPUTER CONTROL : ICACC 2009 - PROCEEDINGS, 2009, : 124 - 128
  • [37] A Feature Selection Method Based on Feature Grouping and Genetic Algorithm
    Lin, Xiaohui
    Wang, Xiaomei
    Xiao, Niyi
    Huang, Xin
    Wang, Jue
    INTELLIGENCE SCIENCE AND BIG DATA ENGINEERING: BIG DATA AND MACHINE LEARNING TECHNIQUES, ISCIDE 2015, PT II, 2015, 9243 : 150 - 158
  • [38] SAR Target Feature Extraction and Recognition Based Multilinear Principal Component Analysis
    Hu, Liping
    Xing, Xiaoyu
    INTERNATIONAL SYMPOSIUM ON OPTOELECTRONIC TECHNOLOGY AND APPLICATION 2014: IMAGE PROCESSING AND PATTERN RECOGNITION, 2014, 9301
  • [39] Feature Extraction of Hyperspectral Image Using Principal Component Analysis and Folded-Principal Component Analysis
    Deepa, P.
    Thilagavathi, K.
    2015 2ND INTERNATIONAL CONFERENCE ON ELECTRONICS AND COMMUNICATION SYSTEMS (ICECS), 2015, : 656 - 660
  • [40] Feature extraction and selection based on genetic algorithm for hyperion hyperspectral images
    Wang Zhenhai
    Hu Guangdao
    Zhang Hongjun
    PROCEEDINGS OF 2006 INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE: 50 YEARS' ACHIEVEMENTS, FUTURE DIRECTIONS AND SOCIAL IMPACTS, 2006, : 265 - 267