Prediction of defect severity by mining software project reports

Cited by: 21
Authors
Jindal R. [1]
Malhotra R. [1]
Jain A. [1]
Affiliations
[1] Department of Computer Science Engineering, Delhi Technological University, New Delhi
Keywords
Defect prediction; Empirical validation; InfoGain; Machine learning; Receiver operating characteristics; Statistical methods; Text mining;
DOI
10.1007/s13198-016-0438-y
Abstract
With ever-increasing demands on software organizations, the rate at which defects are introduced into software cannot be ignored. This has become a serious concern that must be addressed. Defects that creep into software come with varying severity levels, ranging from mild to catastrophic, and the severity associated with each defect is its most critical aspect. In this paper, we develop models that assign an appropriate severity level (high, medium, low or very low) to the defects described in defect reports. We consider defect reports from the public-domain PITS dataset (PITS A, PITS C, PITS D and PITS E), which is widely used by NASA's engineers. Relevant data is extracted from the defect reports using text mining techniques, and models are then built using one statistical method, Multi-nominal Multivariate Logistic Regression (MMLR), and two machine learning methods, Multi-layer Perceptron (MLP) and Decision Tree (DT). The performance of the models was evaluated using receiver operating characteristics analysis, and the DT model was observed to perform best compared with the MMLR and MLP models. © 2016, The Society for Reliability Engineering, Quality and Operations Management (SREQOM), India and The Division of Operation and Maintenance, Lulea University of Technology, Sweden.
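The keywords list InfoGain alongside text mining, but the abstract does not say how it is applied. As a minimal sketch, assuming InfoGain is used to rank report terms by how well their presence separates severity levels, the computation could look like the following. The toy reports, the whitespace tokenisation, and the function names `entropy` and `info_gain` are illustrative assumptions, not taken from the paper or from the PITS data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def info_gain(docs, labels, term):
    """Information gain of the presence/absence of `term` w.r.t. the labels."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    # Weighted entropy of the two partitions induced by the term
    remainder = sum(len(part) / total * entropy(part)
                    for part in (with_term, without_term) if part)
    return entropy(labels) - remainder


# Hypothetical toy defect reports (NOT from the PITS dataset)
reports = [
    ("system crash on launch", "high"),
    ("crash during telemetry upload", "high"),
    ("typo in user manual", "low"),
    ("manual page formatting issue", "low"),
]
docs = [set(text.split()) for text, _ in reports]  # naive tokenisation
labels = [severity for _, severity in reports]

# Rank every term in the vocabulary by information gain, best first
vocab = sorted(set().union(*docs))
ranked = sorted(vocab, key=lambda t: info_gain(docs, labels, t), reverse=True)
```

In a sketch like this, the top-ranked terms would serve as the features passed to a classifier such as a decision tree. The actual preprocessing and the number of terms retained in the paper are not stated in the abstract.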
Pages: 334-351
Page count: 17