Improved Analogy-based Effort Estimation with Incomplete Mixed Data

Cited by: 16
Authors
Abnane, Ibtissam [1]
Idri, Ali [1]
Affiliations
[1] Univ Mohammed 5, ENSIAS, Software Project Management Res Team, Rabat, Morocco
Source
PROCEEDINGS OF THE 2018 FEDERATED CONFERENCE ON COMPUTER SCIENCE AND INFORMATION SYSTEMS (FEDCSIS) | 2018
Keywords
Estimation by analogy; missing data; imputation; SOFTWARE PROJECT EFFORT; CONFIDENCE-INTERVALS; POWER; PREDICTION; MODELS; TESTS;
DOI
10.15439/2018F95
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Estimation by analogy (EBA) is one of the most attractive software development effort estimation techniques. However, a critical issue when using EBA is the occurrence of missing data (MD) in historical data sets. The absence of values for several relevant software attributes is a frequent phenomenon that may cause inaccurate EBA estimates. MD can be numerical and/or categorical. This paper evaluates four MD techniques (toleration, deletion, k-nearest neighbors (KNN) imputation and support vector regression (SVR) imputation) over four mixed data sets. A total of 432 experiments were conducted, combining four MD techniques, nine MD percentages (from 10% to 90%), three missingness mechanisms (MCAR: Missing Completely at Random, MAR: Missing at Random and NIM: Non-Ignorable Missing) and four data sets. The evaluation process consists of four steps and uses several accuracy measures, such as standardized accuracy (SA) and prediction level (Pred). The results suggest that EBA with imputation techniques achieved significantly better SA values than EBA with toleration or deletion, regardless of the missingness mechanism. Moreover, no particular MD imputation technique outperformed the others overall. However, according to Pred and other accuracy criteria, EBA with SVR imputation was the best, followed by EBA with KNN imputation; we also found that using toleration rather than deletion improves the accuracy of EBA.
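The count of 432 experiments follows from the full factorial design described in the abstract (4 techniques x 9 MD percentages x 3 mechanisms x 4 data sets). A minimal sketch that enumerates this grid is given below; the data set labels are hypothetical placeholders, since the abstract does not name the data sets, and the identifier names are illustrative rather than taken from the paper.

```python
from itertools import product

# The four MD techniques named in the abstract.
techniques = ["toleration", "deletion", "knn_imputation", "svr_imputation"]

# Nine MD percentages, from 10% to 90% in steps of 10.
md_percentages = list(range(10, 100, 10))

# The three missingness mechanisms named in the abstract.
mechanisms = ["MCAR", "MAR", "NIM"]

# Hypothetical placeholder labels: the abstract does not name the four data sets.
datasets = [f"dataset_{i}" for i in range(1, 5)]

# Full factorial design: one experiment per combination of factor levels.
experiments = list(product(techniques, md_percentages, mechanisms, datasets))

print(len(experiments))  # 4 * 9 * 3 * 4 = 432
```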
Pages: 1015-1024
Number of pages: 10