Active learning using uncertainty sampling and query-by-committee for software defect prediction

Cited by: 1
Authors
Qu Y. [1]
Chen X. [2]
Chen R. [3]
Ju X. [2]
Guo J. [1]
Affiliations
[1] Jiangsu College of Engineering and Technology, Nantong
[2] Nantong University, Nantong
[3] Nanjing Foreign Language School, Nanjing
Source
International Journal of Performability Engineering | 2019 / Vol. 15 / No. 10
Keywords
Active learning; Software defect prediction; Uncertainty sampling; Vote entropy
DOI
10.23940/ijpe.19.10.p16.27012708
Abstract
Constructing datasets for software defect prediction involves problems such as high labeling costs. Active learning with uncertainty sampling can reduce these costs: the samples with the highest uncertainty are labeled, but the samples with the highest certainty are always discarded. According to cognitive theory, easy samples can improve model performance. Therefore, a hybrid active learning query strategy is proposed: the sample with the lowest information entropy is analyzed again by query-by-committee using vote entropy. Empirical studies show that the proposed HIVE approach outperforms several state-of-the-art active learning approaches. © 2019 Totem Publisher, Inc. All rights reserved.
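The abstract names two measures, information entropy and vote entropy, without giving formulas or the exact rule by which HIVE combines them. The following is only a minimal sketch of the standard textbook definitions of those two measures, not the authors' implementation. The function names, the use of natural logarithms, and the toy inputs are all assumptions made for illustration.

```python
import math
from collections import Counter


def information_entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution.

    `probs` is a hypothetical list of class probabilities for one sample.
    """
    return -sum(p * math.log(p) for p in probs if p > 0)


def vote_entropy(votes):
    """Vote entropy of the hard labels cast by a committee for one sample.

    `votes` is a hypothetical list with one predicted label per committee member.
    """
    n = len(votes)
    return -sum((c / n) * math.log(c / n) for c in Counter(votes).values())


def most_uncertain(pool_probs):
    """Index of the pool sample with the highest information entropy."""
    return max(range(len(pool_probs)),
               key=lambda i: information_entropy(pool_probs[i]))


def most_certain(pool_probs):
    """Index of the pool sample with the lowest information entropy."""
    return min(range(len(pool_probs)),
               key=lambda i: information_entropy(pool_probs[i]))


# Toy pool: predicted (defective, clean) probabilities for three modules.
pool = [[0.9, 0.1], [0.5, 0.5], [0.99, 0.01]]
uncertain_idx = most_uncertain(pool)  # candidate for uncertainty sampling
certain_idx = most_certain(pool)      # candidate to re-check with a committee

# Hypothetical committee votes for the most certain sample.
committee_votes = ["clean", "clean", "defective"]
disagreement = vote_entropy(committee_votes)
```

In this sketch a vote entropy of zero means the committee is unanimous, while larger values indicate disagreement. How the proposed approach acts on that value is not stated in the abstract, so it is left out here.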
Pages: 2701-2708
Page count: 7
Related papers
22 in total
  • [1] Chen X., Gu Q., Liu W.S., Liu S.L., Ni C., State-of-the-art survey of static software defect prediction, Ruan Jian Xue Bao/Journal of Software, 27, 1, pp. 1-25, (2016)
  • [2] Settles B., Active learning literature survey, Computer Sciences Technical Report, pp. 12-25, (2010)
  • [3] Huang S.J., Jin R., Zhou Z.H., Active learning by querying informative and representative examples, IEEE Transactions on Pattern Analysis and Machine Intelligence, 36, 10, pp. 1936-1949, (2014)
  • [4] Li C.-L., Ferng C.-S., Lin H.-T., Active Learning using Hint Information, Neural Computation, 27, 8, pp. 1738-1765, (2015)
  • [5] Li M., Zhang H., Wu R., Sample-based Software Defect Prediction with Active and Semi-Supervised Learning, Automated Software Engineering, 19, 2, pp. 201-230, (2012)
  • [6] Luo G., Qin K., Active learning for software defect prediction, IEICE Transactions on Information and Systems, E95-D, 6, pp. 680-683, (2012)
  • [7] Lu H., Cukic B., An adaptive approach with active learning in software fault prediction, Proceedings of the 8th International Conference on Predictive Models in Software Engineering, pp. 79-88, (2012)
  • [8] Zhou X., Jin L., Luo X., Zhang T., Cross-version defect prediction via hybrid active learning with kernel principal component analysis, Proceedings of 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), pp. 209-220, (2018)
  • [9] Lu H., Kocaguneli E., Cukic B., Defect prediction between software versions with active learning and dimensionality reduction, Proceedings of the 2014 IEEE 25th International Symposium on Software Reliability Engineering, pp. 312-322, (2014)
  • [10] Wang K., Zhang D.Y., Li Y., Zhang R.M., Lin L., Cost-effective active learning for deep image classification, IEEE Transactions on Circuits and Systems for Video Technology, 27, 12, pp. 2591-2600, (2017)