Comparative study of regressor and classifier with decision tree using modern tools

Cited by: 62
Authors
Kushwah, Jitendra Singh [1]
Kumar, Atul [2]
Patel, Subhash [3]
Soni, Rishi [1]
Gawande, Amol [2]
Gupta, Shyam [1]
Affiliations
[1] Inst Technol & Management, Dept Comp Sci & Engn, Gwalior, Madhya Pradesh, India
[2] Dr D Y Patil B Sch, Sr 87-88, Bengaluru-Mumbai Express Bypass, Tathawa, Pune, Maharashtra, India
[3] Vellore Inst Technol Bhopal Univ, Sch Comp Sci & Engn, Bhopal, Madhya Pradesh, India
Keywords
Machine Learning; Decision Tree; Classification; Regression; MSE; RMSE; MAE;
DOI
10.1016/j.matpr.2021.11.635
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Machine Learning is one of the important areas for modeling data, and it can be said to be the core of the field of Data Science. Supervised Machine Learning (SML) has many algorithms for training a model, and these fall into two kinds: Classification and Regression. The Decision Tree as a Classifier is used to train the model on a categorical label, and the Decision Tree as a Regressor is used to train the model on a non-categorical label. In this paper, we focus on the Decision Tree as a Regressor and as a Classifier and compare the metrics. The paper describes the decision tree, analyzes it, and compares it with the most efficient algorithm on different datasets using Python programming. Results show the accuracy and a comparison of the decision tree as a Regressor and as a Classifier. Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and the Confusion Matrix are the performance parameters used to analyze the decision trees, and different Python libraries are used to analyze and visualize the results. As a case study for classification, we used a shopping mall dataset from the UCI Machine Learning Repository to predict whether or not a user purchases an item. This dataset contains 400 records. For the Decision Tree as a Regressor, a dataset from the Kaggle repository is used for analysis, visualization of results, and comparison. In this paper, the Accuracy score is the most important measure for comparing decision trees based on regression and classification, but Mean Squared Error (MSE) is also an important factor in deciding how to split a node into two or more nodes. Copyright (c) 2022 Elsevier Ltd. All rights reserved. Selection and peer-review under responsibility of the scientific committee of the First International Conference on Design and Materials (ICDM)-2021.
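The metrics named in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' code: the abstract does not say which Python libraries were used, so every function name here (mse, rmse, mae, confusion_matrix, accuracy, best_split) is a hypothetical helper written for illustration. The best_split function shows one common way MSE can drive a regression-tree split, by choosing the threshold on a single feature that minimizes the size-weighted MSE of the two child nodes; the paper may do this differently.

```python
import math


def mse(y_true, y_pred):
    """Mean Squared Error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean Absolute Error: average of absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def confusion_matrix(y_true, y_pred):
    """2x2 matrix [[TN, FP], [FN, TP]] for labels encoded as 0/1."""
    matrix = [[0, 0], [0, 0]]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def best_split(xs, ys):
    """Return (threshold, weighted_mse) for the best split on one feature.

    Illustrative assumption: candidate thresholds are midpoints between
    consecutive distinct feature values, and each child node predicts
    the mean of its targets.
    """
    def node_mse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    pairs = sorted(zip(xs, ys))
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * node_mse(left)
                 + len(right) * node_mse(right)) / len(pairs)
        if score < best[1]:
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2, score)
    return best
```

For example, calling mse([3, 5, 2, 7], [2, 5, 4, 7]) averages the squared residuals 1, 0, 4, and 0, and accuracy([0, 1, 1, 0, 1], [0, 1, 0, 0, 1]) counts four matches out of five predictions.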
Pages: 3571-3576
Page count: 6