Enhancing Software Defect Prediction accuracy using Modified Entropy Calculation in Random Forest Algorithm

Authors
Suryawanshi, Ranjeetsingh [1 ]
Kadam, Amol [1 ]
Affiliations
[1] Bharati Vidyapeeth Deemed Be Univ, Coll Engn, Pune, India
Keywords
Random forest; decision tree; classification; prediction; entropy; Taylor series; NETWORKS;
DOI
10.52783/jes.754
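Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code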
0808 ; 0809 ;
Abstract
Imagine you are trying to classify software defects in a large dataset. How would you choose the best algorithm for the task? Many algorithms are available for this problem, including Random Forest, Support Vector Machine, Neural Networks, Naive Bayes, K-Nearest Neighbours, Decision Tree, and Logistic Regression. One of the most widely used is the Random Forest algorithm, which combines multiple Decision Trees to make predictions. However, this algorithm relies on a calculation called Entropy, which measures the uncertainty in the data. The Entropy function uses the natural logarithm, which can be time-consuming to compute. Is there a better way to calculate entropy? In this research, we explored a different way to calculate the natural logarithm, using its Taylor series expansion. A Taylor series is an infinite sum of terms that represents a function using its derivatives. We then modified the Random Forest algorithm by replacing the natural logarithm in the Entropy formula with the Taylor series expression. We tested the modified algorithm on a dataset and compared its performance with that of the original Entropy formula. We found that the modification improved the accuracy of the algorithm on software defect prediction.
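The idea of substituting a Taylor series for the natural logarithm in the entropy formula can be illustrated with a minimal sketch. The abstract does not state which expansion point or how many terms the authors used, so the choices below (expansion about x = 1, ten terms, and the helper names `taylor_ln` and `entropy`) are assumptions made for illustration only, not the paper's implementation.

```python
import math


def taylor_ln(x, terms=10):
    """Approximate ln(x) with a truncated Taylor series about x = 1.

    ln(x) = sum_{k>=1} (-1)^(k+1) * (x - 1)^k / k, valid for 0 < x <= 2.
    The expansion point and the number of terms are illustrative assumptions.
    """
    if not 0.0 < x <= 2.0:
        raise ValueError("series converges only for 0 < x <= 2")
    u = x - 1.0
    total = 0.0
    power = 1.0
    for k in range(1, terms + 1):
        power *= u  # power now holds (x - 1)^k
        total += power / k if k % 2 == 1 else -power / k
    return total


def entropy(probs, log_fn=math.log):
    """Entropy H = -sum(p * log(p)) over the non-zero class probabilities."""
    return -sum(p * log_fn(p) for p in probs if p > 0.0)


# Compare the exact and the series-based entropy for a balanced two-class node.
probs = [0.5, 0.5]
exact = entropy(probs)
approx = entropy(probs, lambda p: taylor_ln(p, terms=10))
print(exact, approx)
```

Class probabilities lie in (0, 1], which is inside the interval where this series converges, but convergence slows as a probability approaches 0, so the number of terms trades accuracy against cost.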
Pages: 84 - 91
Number of pages: 8
Related papers
50 in total
  • [31] Software Vulnerability Prediction Using Grey Wolf-Optimized Random Forest on the Unbalanced Data Sets
    Rhmann, Wasiur
    INTERNATIONAL JOURNAL OF APPLIED METAHEURISTIC COMPUTING, 2022, 13 (01)
  • [32] Prediction of Employee Turn Over Using Random Forest Classifier with Intensive Optimized Pca Algorithm
    Wild Ali, Alaeldeen Bader
    WIRELESS PERSONAL COMMUNICATIONS, 2021, 119 (04) : 3365 - 3382
  • [33] Prediction of Employee Turn Over Using Random Forest Classifier with Intensive Optimized Pca Algorithm
    Alaeldeen Bader Wild Ali
    Wireless Personal Communications, 2021, 119 : 3365 - 3382
  • [34] Software Defect Prediction Using Principal Component Analysis and Naive Bayes Algorithm
    Dhamayanthi, N.
    Lavanya, B.
    PROCEEDINGS OF INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND DATA ENGINEERING (ICCIDE 2018), 2019, 28 : 241 - 248
  • [35] Prediction of Incident Delirium Using a Random Forest classifier
    Corradi, John P.
    Thompson, Stephen
    Mather, Jeffrey F.
    Waszynski, Christine M.
    Dicks, Robert S.
    JOURNAL OF MEDICAL SYSTEMS, 2018, 42 (12)
  • [36] Prediction of Incident Delirium Using a Random Forest classifier
    John P. Corradi
    Stephen Thompson
    Jeffrey F. Mather
    Christine M. Waszynski
    Robert S. Dicks
    Journal of Medical Systems, 2018, 42
  • [37] Enhancing building energy efficiency using a random forest model: A hybrid prediction approach
    Liu, Yang
    Chen, Hongyu
    Zhang, Limao
    Feng, Zongbao
    ENERGY REPORTS, 2021, 7 : 5003 - 5012
  • [38] Heart Disease Prediction System Using Random Forest
    Singh, Yeshvendra K.
    Sinha, Nikhil
    Singh, Sanjay K.
    ADVANCES IN COMPUTING AND DATA SCIENCES, ICACDS 2016, 2017, 721 : 613 - 623
  • [39] Accident Prediction Accuracy Assessment for Highway-Rail Grade Crossings Using Random Forest Algorithm Compared with Decision Tree
    Zhou, Xiaoyi
    Lu, Pan
    Zheng, Zijian
    Tolliver, Denver
    Keramati, Amin
    RELIABILITY ENGINEERING & SYSTEM SAFETY, 2020, 200
  • [40] Using Random Forest Algorithm for Breast Cancer Diagnosis
    Dai, Bin
    Chen, Rung-Ching
    Zhu, Shun-Zhi
    Zhang, Wei-Wei
    2018 INTERNATIONAL SYMPOSIUM ON COMPUTER, CONSUMER AND CONTROL (IS3C 2018), 2018, : 449 - 452