An investigation of machine learning based prediction systems

Cited by: 125
Authors
Mair, C [1]
Kadoda, G [1]
Lefley, M [1]
Phalp, K [1]
Schofield, C [1]
Shepperd, M [1]
Webster, S [1]
Affiliations
[1] Bournemouth Univ, Design Engn & Comp Dept, Empir Software Engn Res Grp, Poole BH12 5BB, Dorset, England
Funding
UK Engineering and Physical Sciences Research Council (EPSRC)
Keywords
machine learning; neural net; case-based reasoning; rule induction; software cost model; software effort estimation; prediction system;
DOI
10.1016/S0164-1212(00)00005-4
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Traditionally, researchers have used either off-the-shelf models such as COCOMO, or developed local models using statistical techniques such as stepwise regression, to obtain software effort estimates. More recently, attention has turned to a variety of machine learning methods such as artificial neural networks (ANNs), case-based reasoning (CBR) and rule induction (RI). This paper outlines some comparative research into the use of these three machine learning methods to build software effort prediction systems. We briefly describe each method and then apply the techniques to a dataset of 81 software projects derived from a Canadian software house in the late 1980s. We compare the prediction systems in terms of three factors: accuracy, explanatory value and configurability. We show that ANN methods have superior accuracy and that RI methods are least accurate. However, this view is somewhat counteracted by problems with explanatory value and configurability. For example, we found that considerable effort was required to configure the ANN and that this compared very unfavourably with the other techniques, particularly CBR and least squares regression (LSR). We suggest that further work be carried out, both to further explore interaction between the end-user and the prediction system, and also to facilitate configuration, particularly of ANNs. © 2000 Elsevier Science Inc. All rights reserved.
Pages: 23-29
Page count: 7