Incremental feature selection approach to multi-dimensional variation based on matrix dominance conditional entropy for ordered data set

Cited by: 1
Authors
Xu, Weihua [1]
Yang, Yifei [1]
Ding, Yi [1]
Chen, Xiyang [2]
Lv, Xiaofang [3]
Affiliations
[1] Southwest Univ, Coll Artificial Intelligence, Chongqing 400715, Peoples R China
[2] Xian Univ Sci & Technol, Coll Comp Sci & Technol, Xian 710600, Peoples R China
[3] Southwest Univ, Coll Life Sci, Chongqing 400715, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Conditional entropy; Dominance matrix; Feature selection; Ordered data set; Rough set; ATTRIBUTE REDUCTION; DYNAMIC DATA; LEARNING ALGORITHM; ROUGH SETS
DOI
10.1007/s10489-024-05411-3
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Rough set theory is a mathematical tool widely used across many fields to handle uncertainty. Feature selection, an essential and independent research area within rough set theory, aims to identify a small subset of important features by eliminating irrelevant, redundant, or noisy ones. In practice, the characteristics of data change over time and with other factors, producing ordered data sets whose features vary. Existing feature extraction methods are poorly suited to such data sets: they do not consider previous reduction results when features change and must recompute from scratch, which consumes significant time. Incremental attribute reduction algorithms address this issue by reusing prior reduction results, which effectively reduces computation time. Motivated by this approach, this paper investigates incremental feature selection algorithms for ordered data sets with changing features. First, we discuss the dominance matrix and the dominance conditional entropy, and introduce update principles for the dominance matrix and the dominance diagonal matrix when features change. We then propose two incremental feature selection algorithms for ordered data sets, one for adding features (IFS-A) and one for deleting features (IFS-D). Additionally, nine UCI datasets are used to evaluate the performance of the proposed algorithms. The experimental results show that the average classification accuracy of IFS-A and IFS-D under four classifiers on twelve datasets is 82.05% and 80.75%, respectively, an increase of 5.48% and 3.68% over the original data.
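The ideas named in the abstract (a dominance matrix, a dominance conditional entropy, and a matrix update when a feature is added) can be illustrated with a minimal sketch. The code below is not the paper's IFS-A or IFS-D algorithm. It assumes the standard dominance relation (object j dominates object i when j is greater than or equal to i on every feature) and one commonly used form of dominance conditional entropy, H(D|A) = -(1/|U|) Σ log2(|[x]_A ∩ [x]_D| / |[x]_A|); the paper's exact definitions may differ. All function names are hypothetical.

```python
import numpy as np

def dominance_matrix(X):
    """Boolean matrix M with M[i, j] = True when object j is >= object i
    on every column of X (assumed dominance relation, not the paper's code)."""
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]  # treat a single feature as one column
    # Broadcast to shape (n, n, m): compare object j against object i per feature.
    return (X[None, :, :] >= X[:, None, :]).all(axis=2)

def add_feature(M, column):
    """Update an existing dominance matrix when one feature is added:
    the new relation is the element-wise AND with the new feature's matrix."""
    return M & dominance_matrix(column)

def dominance_conditional_entropy(M_cond, M_dec):
    """Assumed form: -(1/n) * sum_i log2(|[x_i]_A ∩ [x_i]_D| / |[x_i]_A|).
    Row i of each matrix is the dominating class of object i."""
    class_sizes = M_cond.sum(axis=1)             # never zero: the diagonal is True
    joint_sizes = (M_cond & M_dec).sum(axis=1)   # never zero for the same reason
    return float(-np.mean(np.log2(joint_sizes / class_sizes)))

# Toy ordered data: three objects, two condition features, one decision.
X = np.array([[1, 2], [2, 1], [3, 3]])
d = np.array([1, 2, 3])

M_first = dominance_matrix(X[:, 0])      # matrix for the first feature only
M_both = add_feature(M_first, X[:, 1])   # incremental update with the second
M_dec = dominance_matrix(d)

h_second_only = dominance_conditional_entropy(dominance_matrix(X[:, 1]), M_dec)
h_both = dominance_conditional_entropy(M_both, M_dec)
```

The point of the AND update is that adding a feature never requires revisiting the features already processed, which is the kind of saving an incremental method relies on. Deleting a feature cannot be undone by a single AND, so a sketch like this would need to keep the per-feature matrices or recompute from the remaining columns.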
Pages: 4890-4910
Number of pages: 21