Feature Selection Based on Difference and Similitude in Data Mining

Cited by: 0
Author
WU Ming
Affiliation
Funding
National Natural Science Foundation of China
Keywords
knowledge reduction; feature selection; rough set; difference set; similitude set; attribute rank function;
DOI
Not available
Chinese Library Classification number
TP311.13
Discipline classification code
1201
Abstract
Feature selection is a preprocessing step in data mining, and heuristic search algorithms are often used for it. Many of these algorithms are based on discernibility matrices, which consider only the differences in an information system. Because a discernibility matrix does not reveal similar characteristics, the resulting rules may not be the simplest. Difference-similitude (DS) methods take both difference and similitude into account, but the existing search strategy can cause some important features to be ignored. This paper proposes an improved DS-based algorithm to solve this problem. The improved algorithm defines an attribute rank function that considers both difference and similitude in feature selection. Experiments show that the algorithm is effective, especially for large-scale databases. Its time complexity is $O(|C|^2 |U|^2)$.
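The idea of using both difference and similitude can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not define the attribute rank function, so the scoring below (a plain count of how often each attribute appears in difference sets and similitude sets) is a hypothetical stand-in, and the function names `difference_and_similitude_sets` and `rank_attributes` are invented for illustration. The sketch assumes a decision table whose rows hold condition attribute values and whose labels hold the decision values.

```python
from collections import Counter
from itertools import combinations


def difference_and_similitude_sets(rows, decisions):
    """Build per-pair attribute sets from a decision table.

    rows: list of tuples of condition attribute values.
    decisions: list of decision labels, one per row.
    For a pair with different decisions, record the attributes whose
    values differ (a difference set). For a pair with the same decision,
    record the attributes whose values agree (a similitude set).
    """
    n_attrs = len(rows[0])
    all_attrs = frozenset(range(n_attrs))
    diff_sets, sim_sets = [], []
    for i, j in combinations(range(len(rows)), 2):
        differing = frozenset(
            a for a in range(n_attrs) if rows[i][a] != rows[j][a]
        )
        if decisions[i] != decisions[j]:
            diff_sets.append(differing)
        else:
            sim_sets.append(all_attrs - differing)
    return diff_sets, sim_sets


def rank_attributes(diff_sets, sim_sets, n_attrs):
    """Hypothetical rank: occurrences in difference plus similitude sets.

    This is an assumed scoring rule for illustration only, not the
    attribute rank function defined in the paper.
    """
    diff_count = Counter(a for s in diff_sets for a in s)
    sim_count = Counter(a for s in sim_sets for a in s)
    scores = {a: diff_count[a] + sim_count[a] for a in range(n_attrs)}
    order = sorted(range(n_attrs), key=lambda a: (-scores[a], a))
    return order, scores


if __name__ == "__main__":
    # Toy table: the decision simply copies attribute 0.
    rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
    decisions = [0, 0, 1, 1]
    diff_sets, sim_sets = difference_and_similitude_sets(rows, decisions)
    order, scores = rank_attributes(diff_sets, sim_sets, n_attrs=2)
    print(order, scores)
```

In the toy table, attribute 0 both separates objects with different decisions and agrees across objects with the same decision, so it ranks first under this assumed score. The pairwise enumeration alone costs on the order of |C| |U|^2 comparisons, reading C as the set of condition attributes and U as the set of objects, which is the usual rough set convention and an assumption here.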
Pages: 467-470
Number of pages: 4