A Genetic Algorithm Based Modification on the LTS Algorithm for Large Data Sets

Cited by: 1
Authors
Satman, M. Hakan [1]
Affiliation
[1] Istanbul Univ, Dept Econometr, TR-34 Istanbul, Turkey
Keywords
C-steps; Genetic algorithms; Least trimmed squares regression; Outliers; Robust regression
DOI
10.1080/03610918.2011.598989
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Discipline classification codes
020208; 070103; 0714
Abstract
The authors introduce an algorithm for estimating the least trimmed squares (LTS) parameters in large data sets. The algorithm performs a genetic algorithm search to form a basic subset that is unlikely to contain outliers. Rousseeuw and van Driessen (2006) suggested drawing independent basic subsets and iterating C-steps many times to minimize the LTS criterion. The authors' algorithm instead uses a genetic algorithm to form a basic subset and iterates C-steps to calculate the cost value of the LTS criterion. Genetic algorithms are successful methods for optimizing nonlinear objective functions, but they are slow in many cases. The genetic algorithm configuration in the algorithm can be kept simple because only a small number of observations are searched from the data. An R package is prepared to perform Monte Carlo simulations on the algorithm. Simulation results show that the performance of the algorithm is suitable even for large data sets because a small number of trials is always performed.
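The C-step iteration mentioned in the abstract can be illustrated with a minimal sketch. It assumes a single predictor with an intercept, and the starting subset is picked by hand; in the paper that subset would come from the genetic algorithm search, which is not reproduced here. The function names `ols_fit` and `c_steps` and the toy data are hypothetical and are not taken from the authors' R package.

```python
def ols_fit(xs, ys):
    """Closed-form least squares for y = a + b*x.

    Assumes the x values are not all identical (otherwise sxx is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def c_steps(xs, ys, subset, h, max_iter=100):
    """Iterate C-steps from a starting index subset.

    Each step fits OLS on the current subset, then keeps the h observations
    with the smallest squared residuals. Stops when the subset is unchanged.
    Returns (intercept, slope, cost), where cost is the sum of the h smallest
    squared residuals under the returned fit (the LTS criterion).
    """
    subset = sorted(subset)
    for _ in range(max_iter):
        a, b = ols_fit([xs[i] for i in subset], [ys[i] for i in subset])
        sq_res = [(y - a - b * x) ** 2 for x, y in zip(xs, ys)]
        order = sorted(range(len(xs)), key=lambda i: sq_res[i])
        new_subset = sorted(order[:h])
        cost = sum(sq_res[i] for i in new_subset)
        if new_subset == subset:
            break
        subset = new_subset
    return a, b, cost


# Toy data (illustrative only): y = 2 + 3x with the last two points replaced by outliers.
xs = [float(i) for i in range(10)]
ys = [2.0 + 3.0 * x for x in xs]
ys[8] = 100.0
ys[9] = 100.0

# Hand-picked outlier-free starting subset; h = 8 observations are retained.
intercept, slope, cost = c_steps(xs, ys, subset=[0, 1], h=8)
print(intercept, slope, cost)
```

A starting subset that contains outliers may converge to a poorer local minimum, which is why the choice of basic subset matters for the LTS criterion.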
Pages: 644-652
Number of pages: 9
Related papers
50 in total
  • [31] ON K-MEDOID CLUSTERING OF LARGE DATA SETS WITH THE AID OF A GENETIC ALGORITHM - BACKGROUND, FEASIBILITY AND COMPARISON
    LUCASIUS, CB
    DANE, AD
    KATEMAN, G
    ANALYTICA CHIMICA ACTA, 1993, 282 (03) : 647 - 669
  • [32] GANY: A Genetic Spectral-based Clustering Algorithm for Large Data Analysis
    Menendez, Hector D.
    Camacho, David
    2015 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2015, : 640 - 647
  • [33] Epigenetic Modification of Genetic Algorithm
    Chrominski, Kornel
    Tkacz, Magdalena
    Boryczka, Mariusz
    COMPUTATIONAL SCIENCE - ICCS 2020, PT II, 2020, 12138 : 267 - 278
  • [34] Modification of Genetic Algorithm Based on Extinction Events and Migration
    Kieszek, Rafal
    Kachel, Stanislaw
    Kozakiewicz, Adam
    APPLIED SCIENCES-BASEL, 2023, 13 (09):
  • [35] Improved NLOS Error Mitigation Based on LTS Algorithm
    Khodjaev, Jasurbek
    Tedesco, Salvatore
    O'Flynn, Brendan
    PROGRESS IN ELECTROMAGNETICS RESEARCH LETTERS, 2016, 58 : 133 - 139
  • [36] A hypergraph based clustering algorithm for spatial data sets
    Cherng, JS
    Lo, MJ
    2001 IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS, 2001, : 83 - 90
  • [37] Clustering Based Bagging Algorithm on Imbalanced Data Sets
    Sun, Xiao-Yan
    Zhang, Hua-Xiang
    Wang, Zhi-Chao
    INTEGRATED UNCERTAINTY IN KNOWLEDGE MODELLING AND DECISION MAKING, 2011, 7027 : 179 - 186
  • [38] Data mining technology based on rough set and genetic algorithm under large data environment
    Wang, Liping
    PROCEEDINGS OF THE 2016 2ND INTERNATIONAL CONFERENCE ON MATERIALS ENGINEERING AND INFORMATION TECHNOLOGY APPLICATIONS (MEITA 2016), 2017, 107 : 561 - 565
  • [39] NUMERICAL OPTIMIZATION ALGORITHM BASED ON GENETIC ALGORITHM FOR A DATA COMPLETION PROBLEM
    Jouilik, B.
    Daoudi, J.
    Tajani, C.
    Abouchabaka, J.
    TWMS JOURNAL OF APPLIED AND ENGINEERING MATHEMATICS, 2023, 13 (01): : 86 - 97
  • [40] A Hybrid Algorithm for Satellite Data Transmission Schedule Based on Genetic Algorithm
    李云峰
    武小悦
    Journal of China Ordnance, 2008, (03) : 203 - 208