Feature selection for software effort estimation with localized neighborhood mutual information

Cited by: 11
Authors
Liu, Qin [1]
Xiao, Jiakai [2]
Zhu, Hongming [1]
Affiliations
[1] Tongji Univ, Sch Software Engn, Shanghai 201804, Peoples R China
[2] Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China
Source
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS | 2019 / Vol. 22 / Suppl. 3
Keywords
Feature selection; Case based reasoning; Neighborhood mutual information; Software effort estimation;
DOI
10.1007/s10586-018-1884-x
CLC number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Feature selection is usually applied before case-based reasoning (CBR) is used for software effort estimation (SEE). Unfortunately, most feature selection methods treat CBR as a black box, so there is no guarantee that CBR is appropriate for the selected feature subset. The key to solving this problem is to measure how well the CBR assumption holds for a given feature set. In this paper, a measure called localized neighborhood mutual information (LNI) is proposed for this purpose, and a greedy method called LNI-based feature selection (LFS) is designed around it. Experiments with leave-one-out cross-validation (LOOCV) on 6 benchmark datasets demonstrate that (1) CBR makes effective estimates with the LFS-selected subset compared with a randomized baseline method; and, compared with three representative feature selection methods, (2) LFS achieves the best MAR value on 3 out of 6 datasets with a 14% average improvement, and (3) LFS achieves the best MMRE on 5 out of 6 datasets with a 24% average improvement.
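The evaluation protocol named in the abstract can be illustrated with a minimal sketch. It assumes that CBR is approximated by a k-nearest-neighbour analogy estimator with Euclidean distance, that MAR stands for mean absolute residual, and that MMRE stands for mean magnitude of relative error; the abstract does not expand these terms or define LNI, so LNI and LFS are not implemented here. The function names and the toy data are hypothetical.

```python
import math

def knn_estimate(train_X, train_y, query, k=1):
    """Analogy-style estimate: mean effort of the k nearest past projects.
    (Assumption: Euclidean distance, unweighted mean.)"""
    ranked = sorted((math.dist(x, query), effort)
                    for x, effort in zip(train_X, train_y))
    nearest = ranked[:k]
    return sum(effort for _, effort in nearest) / len(nearest)

def project(row, features):
    """Keep only the selected feature columns of one project record."""
    return [row[j] for j in features]

def loocv_errors(X, y, features, k=1):
    """Leave-one-out cross-validation on a feature subset.
    Returns (MAR, MMRE) under the definitions assumed above."""
    abs_residuals, rel_errors = [], []
    for i in range(len(X)):
        # Hold out project i and estimate it from all other projects.
        train_X = [project(r, features) for n, r in enumerate(X) if n != i]
        train_y = [e for n, e in enumerate(y) if n != i]
        pred = knn_estimate(train_X, train_y, project(X[i], features), k)
        abs_residuals.append(abs(y[i] - pred))
        rel_errors.append(abs(y[i] - pred) / y[i])
    n = len(X)
    return sum(abs_residuals) / n, sum(rel_errors) / n

# Hypothetical toy data: column 0 tracks effort, column 1 is noise.
X = [[1.0, 5.0], [2.0, 1.0], [4.0, 9.0], [7.0, 3.0]]
y = [10.0, 20.0, 40.0, 80.0]

mar, mmre = loocv_errors(X, y, features=[0], k=1)
print(f"MAR={mar:.3f}  MMRE={mmre:.3f}")
```

Calling `loocv_errors` with different `features` lists gives one way to compare candidate feature subsets under the same protocol; lower MAR and MMRE indicate better estimates.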
Pages: S6953-S6961
Page count: 9
Related papers
50 in total
[31]   Mutual Information Estimation for Filter Based Feature Selection Using Particle Swarm Optimization [J].
Hoai Bach Nguyen ;
Xue, Bing ;
Andreae, Peter .
APPLICATIONS OF EVOLUTIONARY COMPUTATION, EVOAPPLICATIONS 2016, PT I, 2016, 9597 :719-736
[32]   GA-based method for feature selection and parameters optimization for machine learning regression applied to software effort estimation [J].
Oliveira, Adriano L. I. ;
Braga, Petronio L. ;
Lima, Ricardo M. F. ;
Cornelio, Marcio L. .
INFORMATION AND SOFTWARE TECHNOLOGY, 2010, 52 (11) :1155-1166
[33]   Towards effective feature selection in estimating software effort using machine learning [J].
Jadhav, Akshay ;
Kumar Shandilya, Shishir .
JOURNAL OF SOFTWARE-EVOLUTION AND PROCESS, 2024, 36 (05)
[34]   Feature Selection Based on Neighborhood Self-Information [J].
Wang, Changzhong ;
Huang, Yang ;
Shao, Mingwen ;
Hu, Qinghua ;
Chen, Degang .
IEEE TRANSACTIONS ON CYBERNETICS, 2020, 50 (09) :4031-4042
[35]   Nearest-Neighborhood Linear Regression in an Application with Software Effort Estimation [J].
Leal, Luciana Q. ;
Fagundes, Roberta A. A. ;
de Souza, Renata M. C. R. ;
Moura, Hermano P. ;
Gusmao, Cristine M. G. .
2009 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN AND CYBERNETICS (SMC 2009), VOLS 1-9, 2009, :5030-+
[36]   Selection of Measurements in Topology Estimation with Mutual Information [J].
Krstulovic, Jakov ;
Miranda, Vladimiro .
2014 IEEE INTERNATIONAL ENERGY CONFERENCE (ENERGYCON 2014), 2014, :589-596
[37]   Fuzzy C-means clustering-based multi-label feature selection via weighted neighborhood mutual information [J].
Sun, Lin ;
Guo, Jiaqi ;
Wu, Xuejiao ;
Xu, Jiucheng .
INFORMATION SCIENCES, 2025, 718
[38]   Mutual information for enhanced feature selection in visual tracking [J].
Stamatescu, Victor ;
Wong, Sebastien ;
Kearney, David ;
Lee, Ivan ;
Milton, Anthony .
AUTOMATIC TARGET RECOGNITION XXV, 2015, 9476
[39]   A review of feature selection methods based on mutual information [J].
Vergara, Jorge R. ;
Estévez, Pablo A. .
NEURAL COMPUTING AND APPLICATIONS, 2014, 24 :175-186
[40]   A Powerful Feature Selection approach based on Mutual Information [J].
El Akadi, Ali ;
El Ouardighi, Abdeljalil ;
Aboutajdine, Driss .
INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2008, 8 (04) :116-121