Feature Selection Based on Difference and Similitude in Data Mining

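Author
WU Ming
Funding
National Natural Science Foundation of China
Keywords
knowledge reduction; feature selection; rough set; difference set; similitude set; attribute rank function
DOI
Not available
Chinese Library Classification (CLC) number
TP311.13
Discipline classification code
1201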
Abstract
Feature selection is a preprocessing step in data mining, and heuristic search algorithms are often used for it. Many of these algorithms are based on discernibility matrices, which consider only the differences between objects in an information system. Because a discernibility matrix does not reveal the characteristics that objects share, the resulting rules may not be the simplest ones. Difference-similitude (DS) methods take both difference and similitude into account, but the existing search strategy can cause some important features to be ignored. This paper proposes an improved DS-based algorithm to solve this problem. The improved algorithm defines an attribute rank function that considers both difference and similitude during feature selection. Experiments show that the algorithm is effective, especially for large-scale databases. Its time complexity is O(|C|^2 |U|^2), where, in the usual rough set notation, C is the set of condition attributes and U is the universe of objects.
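The abstract does not give the definitions of the difference set, the similitude set, or the attribute rank function, so the following Python sketch only illustrates the general idea under stated assumptions. It assumes that a difference set collects the attributes on which two objects with different decisions disagree, and that a similitude set collects the attributes on which two objects with the same decision agree. The function names, the toy data, and the frequency-based `rank_attributes` score are hypothetical and are not the paper's rank function.

```python
from collections import Counter
from itertools import combinations


def difference_similitude_sets(rows, decisions):
    """Build difference and similitude sets over all object pairs.

    Assumed definitions (not taken from the paper):
    - difference set: attributes whose values differ between two
      objects that have different decisions;
    - similitude set: attributes whose values agree between two
      objects that have the same decision.
    """
    diff_sets, sim_sets = [], []
    for i, j in combinations(range(len(rows)), 2):
        differing = frozenset(a for a in rows[i] if rows[i][a] != rows[j][a])
        agreeing = frozenset(rows[i]) - differing
        if decisions[i] != decisions[j]:
            diff_sets.append(differing)
        else:
            sim_sets.append(agreeing)
    return diff_sets, sim_sets


def rank_attributes(diff_sets, sim_sets):
    """Hypothetical rank: count how often each attribute appears in
    either kind of set, highest count first (ties broken by name)."""
    score = Counter()
    for s in diff_sets:
        score.update(s)
    for s in sim_sets:
        score.update(s)
    return sorted(score, key=lambda a: (-score[a], a))


# Toy decision table: three condition attributes, one decision per object.
rows = [
    {"a": 1, "b": 0, "c": 1},
    {"a": 1, "b": 1, "c": 0},
    {"a": 0, "b": 0, "c": 1},
    {"a": 0, "b": 1, "c": 1},
]
decisions = ["yes", "yes", "no", "no"]

diff_sets, sim_sets = difference_similitude_sets(rows, decisions)
ranking = rank_attributes(diff_sets, sim_sets)
print(ranking)
```

In this toy table attribute `a` alone separates the two decision classes, and it is also shared by every pair of objects within a class, so it ranks first under the assumed score. A ranking built only from the difference sets would not use the second piece of evidence.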
Pages: 467-470
Number of pages: 4