Source code size prediction using use case metrics: an empirical comparison with use case points

Cited by: 6
Authors
Badri M. [1]
Badri L. [1]
Flageol W. [1]
Toure F. [1]
Affiliation
[1] Software Engineering Research Laboratory, Department of Mathematics and Computer Science, University of Quebec, Trois-Rivières, QC
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
C4.5; k-NN; Linear regression; Logistic regression; LOO cross validation; Multilayer perceptron neural network; Naïve Bayes; Prediction models; Random forest; ROC and AUC analysis; Source code size; Use case metrics; Use case points; Use cases;
DOI
10.1007/s11334-016-0285-7
Abstract
Software source code size, measured in source lines of code (SLOC), is an important input to many parametric software development effort estimation methods. In this paper, we empirically investigate the early prediction of SLOC for object-oriented software using use case metrics. We built the prediction models with several modeling techniques. Univariate logistic regression and simple linear regression were used to evaluate the individual effect of each use case metric on SLOC, and multivariate logistic regression and multiple linear regression were used to explore the combined effect of the metrics. We also used several machine learning methods (k-NN, naïve Bayes, C4.5, random forest, and multilayer perceptron neural network). The prediction models were evaluated using receiver operating characteristic (ROC) analysis, in particular the area under the curve (AUC) measure, and leave-one-out (LOO) cross validation. We report an empirical study using data collected from five open source Java projects. The use case metrics were compared with the well-known use case points method. The results provide evidence that the use case metrics-based approach predicts SLOC more accurately than the use case points-based approach. © 2016, Springer-Verlag London.
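As a rough illustration of the evaluation described in the abstract, the sketch below runs leave-one-out cross validation on a simple linear regression and computes an AUC value from scores and binary labels. It is not the authors' implementation: the metric name (`transactions`), the data values, and the "large system" threshold are hypothetical, and the record does not show the paper's actual metrics, data, or tooling.

```python
from statistics import mean


def fit_simple_linear(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def loo_predictions(xs, ys):
    """Predict each observation from a model fitted on all the others."""
    preds = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_simple_linear(train_x, train_y)
        preds.append(a + b * xs[i])
    return preds


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical data: one use case metric and the SLOC of each system.
    transactions = [3, 5, 8, 12, 15, 20]
    sloc = [180, 240, 430, 580, 790, 980]

    preds = loo_predictions(transactions, sloc)
    mae = mean(abs(p - y) for p, y in zip(preds, sloc))
    print("LOO mean absolute error:", round(mae, 1))

    # Hypothetical binary target: is the system "large" (SLOC above 500)?
    labels = [1 if y > 500 else 0 for y in sloc]
    print("AUC of LOO predictions:", auc(preds, labels))
```

The AUC here uses the pairwise-ranking definition, which equals the area under the ROC curve; a library routine would normally be used for larger data sets.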
Pages: 143-159
Page count: 16