Fast Dual Selection using Genetic Algorithms for Large Data Sets

Cited by: 0

Authors:

Ros, Frederic [1]

Harba, Rachid [1]

Pintore, Marco [2]

Affiliations:

[1] Univ Orleans, Lab PRISME, Orleans, France

[2] PILA, Orleans, France

Source:

2012 12th International Conference on Intelligent Systems Design and Applications (ISDA) | 2012

Keywords:

instance and feature selection; scaling; genetic algorithms; supervised classification; k-nearest neighbors; nearest-neighbor rule; learning algorithms

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper is devoted to feature and instance selection managed by genetic algorithms (GA) in the context of supervised classification. We propose a GA encoded for selecting features in which each evaluated chromosome delivers a set of instances. The main aim is to optimize the processing time, which is particularly problematic when handling large databases. A key feature of our approach is the variable fitness evaluation based on scalability methodologies. Experimental results indicate that the preliminary version of the proposed algorithm can significantly reduce the computation time and is therefore applicable to high-dimensional data sets.
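
The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of the general idea, not the authors' method. It assumes a binary feature mask as the chromosome, a single Hart-style condensing pass as a stand-in for the instance selection each chromosome "delivers", a 1-nearest-neighbor classifier, and a fitness subsample that grows with the generation as one possible reading of "variable fitness evaluation". All function names and parameters (`dual_selection_ga`, `select_instances`, `evaluate`, `pop_size`, and so on) are hypothetical.

```python
import random

def sq_dist(a, b, idx):
    """Squared Euclidean distance restricted to the feature indices in idx."""
    return sum((a[i] - b[i]) ** 2 for i in idx)

def select_instances(data, idx):
    """Assumed instance selection: one condensing pass that keeps a point
    only if the instances kept so far would misclassify it (1-NN)."""
    kept = [data[0]]
    for x, y in data[1:]:
        nearest = min(kept, key=lambda t: sq_dist(t[0], x, idx))
        if nearest[1] != y:
            kept.append((x, y))
    return kept

def evaluate(mask, data, sample_size, rng):
    """Score one feature mask on a random subsample.
    Returns (1-NN accuracy using the kept instances, kept instances)."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0, []  # a mask that selects no feature cannot classify
    sample = rng.sample(data, min(sample_size, len(data)))
    kept = select_instances(sample, idx)
    hits = sum(
        min(kept, key=lambda t: sq_dist(t[0], x, idx))[1] == y
        for x, y in sample
    )
    return hits / len(sample), kept

def dual_selection_ga(data, n_features, pop_size=12, generations=10, seed=0):
    """Evolve feature masks; each evaluated mask also yields an instance subset."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for g in range(generations):
        # Assumed "variable fitness": cheap small subsamples early on,
        # the full data set in the final generation.
        sample_size = max(20, len(data) * (g + 1) // generations)
        scored = []
        for mask in pop:
            fit, kept = evaluate(mask, data, sample_size, rng)
            scored.append((fit, mask, kept))
        scored.sort(key=lambda s: s[0], reverse=True)
        if g == generations - 1:
            return scored[0]  # (fitness, feature mask, selected instances)
        parents = [m for _, m, _ in scored[: pop_size // 2]]
        pop = parents[:]
        while len(pop) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]       # uniform crossover
            child = [bit ^ (rng.random() < 0.1) for bit in child]  # bit-flip mutation
            pop.append(child)
```

Here `data` is a list of `(feature_vector, label)` pairs, and the call `dual_selection_ga(data, n_features=4)` returns the best mask of the final generation together with its fitness and the instances it retained. Evaluating early generations on small subsamples is what would save time in such a scheme; the schedule used above is an arbitrary choice for illustration.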


Pages: 815-820

Number of pages: 6
