A Fast Hybrid Feature Selection Method Based on Dynamic Clustering and Improved Particle Swarm Optimization for High-Dimensional Health Care Data
Cited by: 3
Authors:
Kang, Yan [1]
Peng, Luhan [1]
Guo, Jing [1]
Lu, Yuhuan [2]
Yang, Yun [1]
Fan, Baochen [1]
Pu, Bin [2]
Affiliations:
[1] Yunnan Univ, Natl Pilot Sch Software, Yunnan Key Lab Software Engn, Kunming 650106, Peoples R China
[2] Hunan Univ, Coll Comp Sci & Elect Engn, Changsha 410082, Peoples R China
Funding:
National Natural Science Foundation of China;
Keywords:
Heuristic algorithms;
Clustering algorithms;
Feature extraction;
Medical services;
Filtering algorithms;
Classification algorithms;
Biomedical monitoring;
Feature selection;
health care data;
high-dimensional data;
correlation-guided clustering;
particle swarm optimization;
CLASSIFICATION;
COLONY;
DOI:
10.1109/TCE.2023.3334373
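Chinese Library Classification (CLC):
TM [Electrical Engineering];
TN [Electronic and Communication Technology];
Discipline classification codes:
0808;
0809;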
Abstract:
The ubiquity and commoditization of wearable sensors have generated a deluge of user-generated health care data and played a key role in clinical utility, particularly when incorporated into personalized prediction models. The "curse of dimensionality" and enormous computational costs are still the main challenges faced by the existing algorithms as the number of wearable datasets exponentially increases. We propose a novel method by hybridizing a clustering method and a wrapper method to reduce the dimensionality of raw wearable datasets while preserving health care information. In the clustering stage, a dynamic correlation-guided feature clustering method reduces the search space by designing a dynamic threshold to filter unrelated high-dimensional features. In the wrapper stage, we obtain the optimal feature subset by improving the powerful search capability of the particle swarm optimization algorithm. A crossover operator based on normalized mutual information similarity is proposed to match particles, which effectively improves the diversity of the offspring swarm to prevent premature convergence. In addition, we propose a dynamic swarm strategy to mutate the duplicate particles in the swarm to enhance the efficiency of the particle search process. Our method is evaluated on ten real public datasets, and the experimental results demonstrate its superior performance.
Pages: 2447-2459
Page count: 13