An Algorithm for Cross-Dependent Feature Selection of Genetic Data

Cited by: 0
Author
Zhang L. [1]
Affiliation
[1] College of Computer Engineering, Jiangsu University of Technology, Changzhou
Source
Dianzi Keji Daxue Xuebao/Journal of the University of Electronic Science and Technology of China | 2022 / Vol. 51 / No. 05
Keywords
Classification; Feature selection; Joint mutual information; Mutual information; Relevance
DOI
10.12178/1001-0548.2021136
Abstract
Feature selection is an essential data preprocessing step in bioinformatics. Traditional feature selection algorithms ignore dependency-based relevance and redundancy between features. This paper proposes a joint feature relevance and redundancy (JFRR) algorithm for feature selection. The algorithm uses mutual information to compute the redundancy between features, and joint mutual information to compute the relevance among the set of selected features, the candidate features, and the class labels. JFRR is compared with six other feature selection algorithms on two classifiers using nine gene datasets, with classification performance measured by Precision_micro and F1_micro. The experimental results show that JFRR can effectively improve classification accuracy. © 2022, Editorial Board of Journal of the University of Electronic Science and Technology of China. All rights reserved.
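The abstract does not state the JFRR scoring formula, so the following is only a minimal Python sketch of the general idea it describes: a greedy forward search that rewards joint mutual information with the class labels and penalizes mutual information with already selected features. The function names (`entropy`, `mutual_info`, `joint_mutual_info`, `select_features`), the averaging over selected features, and the "relevance minus redundancy" combination are all assumptions made for illustration, not the paper's method. Features are assumed to be discrete (for example, discretized expression levels).

```python
import numpy as np


def entropy(*cols):
    """Shannon entropy (bits) of the joint distribution of discrete columns."""
    joint = np.stack(cols, axis=1)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def mutual_info(x, y):
    """I(x; y) = H(x) + H(y) - H(x, y)."""
    return entropy(x) + entropy(y) - entropy(x, y)


def joint_mutual_info(x, s, c):
    """I(x, s; c) = H(x, s) + H(c) - H(x, s, c)."""
    return entropy(x, s) + entropy(c) - entropy(x, s, c)


def select_features(X, y, k):
    """Greedy forward selection (hypothetical scoring, not the paper's formula).

    The first feature maximizes I(f; y). Each later feature maximizes
    mean joint relevance with selected features minus mean redundancy.
    """
    remaining = list(range(X.shape[1]))
    first = max(remaining, key=lambda j: mutual_info(X[:, j], y))
    selected = [first]
    remaining.remove(first)

    while len(selected) < k and remaining:
        def score(j):
            relevance = np.mean(
                [joint_mutual_info(X[:, j], X[:, s], y) for s in selected]
            )
            redundancy = np.mean(
                [mutual_info(X[:, j], X[:, s]) for s in selected]
            )
            return relevance - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: feature 1 duplicates feature 0; feature 2 is independent of both.
y = np.array([0, 0, 1, 1, 0, 0, 1, 1])
X = np.column_stack([y, y, np.array([0, 1, 0, 1, 0, 1, 0, 1])])
print(select_features(X, y, 2))  # prints [0, 2]
```

In the toy data the duplicated feature is skipped in the second step because its redundancy with the first selected feature cancels its joint relevance, which is the behaviour the abstract attributes to accounting for redundancy between features.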
Pages: 754-759
Page count: 5