Swarm Search Methods in Weka for Data Mining

Cited by: 25
Authors
Fong, Simon [1]
Biuk-Aghai, Robert P. [1]
Millham, Richard C. [2]
Affiliations
[1] Univ Macau, Dept Comp & Informat Sci, Data Analyt & Collaborat Comp Lab, Taipa, Macau, Peoples R China
[2] Durban Univ Technol, Dept Informat Technol, ICT & Soc Res Grp, Durban, South Africa
Source
Proceedings of 2018 10th International Conference on Machine Learning and Computing (ICMLC 2018) | 2018
Keywords
Data mining; search methods; feature selection; metaheuristics; optimization; algorithm
DOI
10.1145/3195106.3195167
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Building a good prediction model from high-dimensional data is a challenging endeavor in data mining. One key step in data preprocessing is feature selection (FS), which aims to find the right feature subset for effective supervised learning. FS has two parts: feature evaluators, and search methods that find the appropriate features in the search space. In this paper we introduce a collection of search methods that implement metaheuristic search, also known as swarm search (SS). The advantage of SS over conventional search such as local search is that SS can explore the search space for global optima using a group of autonomous search agents. We have recently added nine new methods to the Weka machine learning workbench. These nine swarm search methods are intended to supplement the existing search methods in Weka and to provide efficient and effective FS in data mining. We carried out two experiments using synthetic data and medical data. The results show that in general SS has certain advantages over the conventional search methods. The SS methods are available in the Weka Package Manager as open source code. Researchers and Weka users are encouraged to enhance data mining performance using these free swarm search programs.
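The split between a feature evaluator and a search method described in the abstract can be illustrated with a small sketch. It is not the authors' code and does not use the Weka API: it assumes a toy evaluator built from hypothetical per-feature relevance scores (`RELEVANCE`, `PENALTY`) and uses a binary particle swarm as one example of a swarm-style search. The abstract does not name the nine methods, so the choice of particle swarm here is only illustrative, and all function names and parameter values are assumptions.

```python
import math
import random

# Hypothetical per-feature relevance scores; a real evaluator would compute
# merit from data (e.g. a correlation- or classifier-based measure).
RELEVANCE = [0.9, 0.05, 0.7, 0.02, 0.4, 0.01]
PENALTY = 0.1  # assumed cost per selected feature, to favor small subsets


def evaluate(mask):
    """Feature evaluator: merit of a subset given as a list of booleans."""
    return sum(r - PENALTY for r, keep in zip(RELEVANCE, mask) if keep)


def swarm_search(n_features, n_agents=10, n_iters=30, seed=0):
    """Search method: binary particle swarm over feature subsets."""
    rng = random.Random(seed)
    pos = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(n_agents)]
    vel = [[0.0] * n_features for _ in range(n_agents)]
    pbest = [p[:] for p in pos]
    pbest_score = [evaluate(p) for p in pos]
    g = max(range(n_agents), key=lambda i: pbest_score[i])
    gbest, gbest_score = pbest[g][:], pbest_score[g]

    for _ in range(n_iters):
        for i in range(n_agents):
            for d in range(n_features):
                # Pull each bit toward the agent's own best and the swarm's best.
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                # Sigmoid maps velocity to the probability of selecting the feature.
                pos[i][d] = rng.random() < 1.0 / (1.0 + math.exp(-vel[i][d]))
            score = evaluate(pos[i])
            if score > pbest_score[i]:
                pbest[i], pbest_score[i] = pos[i][:], score
                if score > gbest_score:
                    gbest, gbest_score = pos[i][:], score
    return gbest, gbest_score
```

Calling `subset, merit = swarm_search(len(RELEVANCE))` returns the best subset found by the agents and its merit. Because the search only talks to `evaluate`, a different evaluator can be swapped in without changing the search code, which mirrors the two-part structure of FS described above.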
Pages: 122-127
Page count: 6