Feature Selection Based on Difference and Similitude in Data Mining

Authors
WU Ming
Affiliation
Funding
National Natural Science Foundation of China;
Keywords
knowledge reduction; feature selection; rough set; difference set; similitude set; attribute rank function;
DOI
Not available
Chinese Library Classification number
TP311.13;
Discipline classification code
1201;
Abstract
Feature selection is a preprocessing step in data mining, and heuristic search algorithms are often used for it. Many heuristic search algorithms are based on discernibility matrices, which consider only the differences in an information system. Because a discernibility matrix does not reveal similar characteristics, the resulting rules may not be the simplest. Although difference-similitude (DS) methods take both difference and similitude into account, the existing search strategy causes some important features to be ignored. This paper proposes an improved DS-based algorithm to solve this problem. The improved algorithm defines an attribute rank function that considers both difference and similitude in feature selection. Experiments show that the algorithm is effective, especially for large-scale databases. The time complexity of the algorithm is $O(|C|^2|U|^2)$.
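To make the distinction between difference and similitude concrete, the following is a minimal Python sketch, not the paper's algorithm. It assumes a common reading of the terms: for two objects with different decisions, the difference set holds the condition attributes on which they disagree (the entry a discernibility matrix would store); for two objects with the same decision, the similitude set holds the attributes on which they agree. The function name `difference_similitude` and the toy decision table are hypothetical, and the attribute rank function mentioned in the abstract is not reproduced here because the abstract does not define it.

```python
from itertools import combinations


def difference_similitude(rows, decisions):
    """Build difference and similitude sets for every pair of objects.

    rows      -- list of tuples of condition-attribute values, one per object
    decisions -- list of decision values, aligned with rows
    Returns two dicts keyed by object-index pairs (i, j) with i < j.
    (Assumed definitions; see the lead-in.)
    """
    n_attrs = len(rows[0])
    all_attrs = frozenset(range(n_attrs))
    diff, sim = {}, {}
    for i, j in combinations(range(len(rows)), 2):
        # Attributes on which the two objects take different values.
        differing = frozenset(
            a for a in range(n_attrs) if rows[i][a] != rows[j][a]
        )
        if decisions[i] != decisions[j]:
            # Different decisions: record what tells the objects apart.
            diff[(i, j)] = differing
        else:
            # Same decision: record what the objects have in common.
            sim[(i, j)] = all_attrs - differing
    return diff, sim


# Hypothetical toy decision table: 3 objects, 3 condition attributes.
rows = [(0, 1, 0), (0, 0, 1), (1, 1, 0)]
decisions = ["yes", "no", "yes"]
diff, sim = difference_similitude(rows, decisions)
```

In this toy table, objects 0 and 2 share a decision and agree on attributes 1 and 2, information that a discernibility matrix alone would not record. Building both structures compares every pair of objects on every attribute, which is the kind of pairwise cost that the object count and attribute count in the stated complexity reflect.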
Pages: 467-470
Number of pages: 4