Fast Dual Selection using Genetic Algorithms for Large Data Sets

Cited by: 0
Authors
Ros, Frederic [1]
Harba, Rachid [1]
Pintore, Marco [2]
Affiliations
[1] Univ Orleans, Lab PRISME, Orleans, France
[2] PILA, Orleans, France
Source
2012 12TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS DESIGN AND APPLICATIONS (ISDA) | 2012
Keywords
instance and feature selection; scaling; genetic algorithms; supervised classification; k-nearest neighbors; NEAREST-NEIGHBOR RULE; LEARNING ALGORITHMS
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper is devoted to feature and instance selection managed by genetic algorithms (GA) in the context of supervised classification. We propose a GA encoded for selecting features in which each evaluated chromosome delivers a set of instances. The main aim is to optimize the processing time, which is particularly problematic when handling large databases. A key feature of our approach is the variable fitness evaluation based on scalability methodologies. Experimental results indicate that the preliminary version of the proposed algorithm can significantly reduce the computation time and is therefore applicable to high-dimensional data sets.
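The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea: a GA over feature bitmasks whose fitness is a leave-one-out 1-NN accuracy computed on a random sample of instances that grows with the generation count. The growing-sample schedule, the penalty of 0.01 per selected feature, the mutation rate, and the names `knn1_accuracy` and `select_features` are all assumptions made for illustration. They are not the authors' method, and the instance-selection part of the paper is left out.

```python
import random


def knn1_accuracy(X, y, features, sample_idx):
    """Leave-one-out 1-NN accuracy on the sampled rows, using only `features`."""
    if not features or len(sample_idx) < 2:
        return 0.0
    correct = 0
    for i in sample_idx:
        best_j, best_d = None, float("inf")
        for j in sample_idx:
            if i == j:
                continue
            # Squared Euclidean distance restricted to the selected features.
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(sample_idx)


def select_features(X, y, generations=15, pop_size=12, seed=0):
    """Return indices of the features kept by the best mask of the last generation."""
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    pop = [[rng.random() < 0.5 for _ in range(m)] for _ in range(pop_size)]
    best_mask = pop[0]
    for g in range(generations):
        # Assumed schedule: cheap, small samples early; the full data set last.
        size = max(2, min(n, n * (g + 1) // generations))
        sample = rng.sample(range(n), size)

        def fitness(mask):
            feats = [f for f in range(m) if mask[f]]
            # Assumed penalty that favours smaller feature subsets.
            return knn1_accuracy(X, y, feats, sample) - 0.01 * len(feats)

        ranked = sorted(pop, key=fitness, reverse=True)
        best_mask = ranked[0]
        parents = ranked[: pop_size // 2]  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, m) if m > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            # Flip each bit with probability 0.1 (assumed mutation rate).
            child = [bit != (rng.random() < 0.1) for bit in child]
            children.append(child)
        pop = parents + children
    return [f for f in range(m) if best_mask[f]]
```

Calling `select_features(X, y)` on a list of numeric rows `X` and labels `y` returns a sorted list of column indices. Evaluating early generations on small samples is one simple way to reduce the cost of fitness evaluation on large data sets, at the price of noisier fitness estimates.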
Pages: 815-820
Number of pages: 6