Software reuse analytics using integrated random forest and gradient boosting machine learning algorithm

Cited by: 27
Authors
Sandhu, Amandeep Kaur [1]
Batth, Ranbir Singh [1]
Affiliations
[1] Lovely Profess Univ, Sch Comp Sci & Engn, Phagwara, Punjab, India
Keywords
AdaBoostM1; confusion matrix; DecisionStump; gradient boosting machine; J48; JRip; LMT; LogitBoost; one R; part; random forest; software metrics; software reuse; DATA MINING TECHNIQUES; MANAGEMENT;
DOI
10.1002/spe.2921
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Cleaner Production (CP) is regarded as influential in achieving sustainable production for production companies. CP mainly deals with the three R's: reuse, reduce, and recycle. For software enterprises, software reuse plays a pivotal role. Software reuse is the process of producing new products or software from existing software by updating it. Data mining is used to extract useful information from existing software. The algorithms used for software reuse face issues related to maintenance cost, accuracy, and performance. Moreover, currently used algorithms do not give accurate results on whether a software component can be reused. Machine learning gives the best results for predicting whether a given software component is reusable. This paper introduces an integrated Random Forest and Gradient Boosting Machine Learning Algorithm (RFGBM), which tests the reusability of given software code using object-oriented parameters such as cohesion, coupling, cyclomatic complexity, bugs, number of children, and depth of inheritance tree. The proposed algorithm is compared with the J48, AdaBoostM1, LogitBoost, Part, One R, LMT, JRip, and DecisionStump algorithms. Performance metrics such as accuracy, error rate, Relative Absolute Error, and Mean Absolute Error are improved using RFGBM. The algorithm also applies data preprocessing with an unsupervised filter to remove missing values and improve efficiency. The proposed algorithm outperforms the existing algorithms in terms of these performance parameters.
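The four performance measures named in the abstract can be illustrated with a small sketch. This is not the authors' RFGBM implementation or their evaluation code. It assumes binary labels (1 = reusable, 0 = not reusable), hypothetical predicted probabilities, a 0.5 decision threshold, and the common definition of Relative Absolute Error as the total absolute error divided by that of a baseline that always predicts the mean label. The function name `evaluate` and the sample data are made up for illustration.

```python
from statistics import mean


def evaluate(actual, predicted_prob, threshold=0.5):
    """Return accuracy, error rate, MAE and RAE for a binary classifier.

    actual         -- list of true labels, 1 (reusable) or 0 (not reusable)
    predicted_prob -- list of predicted probabilities of the label 1
    threshold      -- assumed cut-off for turning a probability into a label
    """
    if not actual or len(actual) != len(predicted_prob):
        raise ValueError("inputs must be non-empty and of equal length")

    # Hard predictions are needed for accuracy and error rate.
    predicted = [1 if p >= threshold else 0 for p in predicted_prob]
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    accuracy = correct / len(actual)

    # Mean Absolute Error is computed on the probabilities.
    abs_errors = [abs(a - p) for a, p in zip(actual, predicted_prob)]
    mae = mean(abs_errors)

    # Relative Absolute Error compares against a predictor that always
    # outputs the mean of the true labels.
    baseline = mean(actual)
    baseline_error = sum(abs(a - baseline) for a in actual)
    rae = sum(abs_errors) / baseline_error if baseline_error else float("nan")

    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "mae": mae,
        "rae": rae,
    }


if __name__ == "__main__":
    # Hypothetical labels and scores for eight software components.
    labels = [1, 0, 1, 1, 0, 0, 1, 0]
    scores = [0.9, 0.2, 0.6, 0.4, 0.1, 0.7, 0.8, 0.3]
    for name, value in evaluate(labels, scores).items():
        print(f"{name}: {value:.3f}")
```

A lower error rate, MAE, and RAE together with a higher accuracy is the sense in which one classifier would be said to outperform another under these measures.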
Pages: 735-747
Page count: 13