Comparison of the decision tree, artificial neural network, and linear regression methods based on the number and types of independent variables and sample size

Cited by: 72
Author
Kim, Yong Soo [1]
Affiliation
[1] SK Telecom, CI Div, Seoul 100999, South Korea
Keywords
data mining; statistical method; artificial neural network; decision tree; linear regression
DOI
10.1016/j.eswa.2006.12.017
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this article, the performance of data mining and statistical techniques was empirically compared while varying the number of independent variables, the types of independent variables, the number of classes of the independent variables, and the sample size. Our study employed 60 simulated examples, with artificial neural networks and decision trees as the data mining techniques and linear regression as the statistical method. Using RMSE as the performance metric, we found the following: (i) for continuous independent variables, the statistical technique (i.e., linear regression) was superior to the data mining techniques (i.e., decision tree and artificial neural network) regardless of the number of variables and the sample size; (ii) for a mix of continuous and categorical independent variables, linear regression was best when there was one categorical variable, while the artificial neural network was superior when there were two or more; (iii) the performance of the artificial neural network improved faster than that of the other methods as the number of classes of the categorical variable increased. © 2006 Elsevier Ltd. All rights reserved.
Pages: 1227-1234
Number of pages: 8
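The abstract compares methods by RMSE on simulated data but does not give the simulation design. The following is a minimal sketch of that kind of comparison for the single continuous-variable case, not the paper's procedure. The data-generating model (`y = 2 + 3x + noise`), the sample sizes, and the helper names `fit_linear`, `fit_stump`, and `simulate` are all assumptions made for illustration. A one-split regression tree stands in for a full decision tree, and the artificial neural network is omitted.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_linear(x, y):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_stump(x, y):
    """One-split regression tree: choose the threshold minimizing squared error.

    Returns (threshold, left_mean, right_mean).
    """
    pairs = sorted(zip(x, y))
    best = None
    for i in range(1, len(pairs)):
        left = [v for _, v in pairs[:i]]
        right = [v for _, v in pairs[i:]]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = sum((v - mean_l) ** 2 for v in left) + sum((v - mean_r) ** 2 for v in right)
        if best is None or sse < best[0]:
            threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best = (sse, threshold, mean_l, mean_r)
    return best[1], best[2], best[3]


def simulate(n, rng):
    """Assumed data-generating model: y = 2 + 3x + Gaussian noise (hypothetical)."""
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]
    y = [2.0 + 3.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
    return x, y


rng = random.Random(0)  # fixed seed so the run is repeatable
x_train, y_train = simulate(200, rng)
x_test, y_test = simulate(100, rng)

# Fit both models on the training sample, then score on held-out data.
intercept, slope = fit_linear(x_train, y_train)
lin_pred = [intercept + slope * xi for xi in x_test]

threshold, mean_l, mean_r = fit_stump(x_train, y_train)
stump_pred = [mean_l if xi <= threshold else mean_r for xi in x_test]

lin_rmse = rmse(y_test, lin_pred)
stump_rmse = rmse(y_test, stump_pred)
print(f"linear regression RMSE: {lin_rmse:.3f}")
print(f"one-split tree RMSE:    {stump_rmse:.3f}")
```

Because the assumed relationship is linear, the linear model should score the lower RMSE here. That matches the direction of finding (i), but it says nothing about the paper's 60 simulated examples or its categorical-variable results.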
Related papers
50 in total
  • [31] Comparison of Artificial Neural Network and Regression Models for Filling Temporal Gaps of Meteorological Variables Time Series
    Dyukarev, Egor
    APPLIED SCIENCES-BASEL, 2023, 13 (04):
  • [32] Comparison of migration modeling in micellar electrokinetic chromatography by linear regression and by use of an artificial neural network
    Metting, HJ
    van Zomeren, PV
    van der Ley, CP
    Coenegracht, PMJ
    de Jong, GJ
    CHROMATOGRAPHIA, 2000, 52 (9-10) : 607 - 613
  • [33] Comparison of multiple linear regression and artificial neural network in developing the objective functions of the orthopaedic screws
    Hsu, Ching-Chi
    Lin, Jinn
    Chao, Ching-Kong
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2011, 104 (03) : 341 - 348
  • [34] Prediction of Engineering Students' Academic Performance Using Artificial Neural Network and Linear Regression: A Comparison
    Arsad, Pauziah Mohd
    Buniyamin, Norlida
    Ab Manan, Jamalul-lail
    2013 IEEE 5TH INTERNATIONAL CONFERENCE ON ENGINEERING EDUCATION (ICEED): ALIGNING ENGINEERING EDUCATION WITH INDUSTRIAL NEEDS FOR NATION DEVELOPMENT, 2013, : 43 - +
  • [35] Comparison of migration modeling in micellar electrokinetic chromatography by linear regression and by use of an artificial neural network
    H. J. Metting
    P. V. van Zomeren
    C. P. van der Ley
    P. M. J. Coenegracht
    G. J. de Jong
    Chromatographia, 2000, 52 : 607 - 613
  • [36] An Empirical Comparison of Multiple Linear Regression and Artificial Neural Network for Concrete Dam Deformation Modelling
    Li, Mingjun
    Wang, Junxing
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2019, 2019
  • [37] Comparison of artificial neural network and multivariate regression methods in prediction of soil Cation Exchange Capacity
    Keshavarzi, Ali
    Sarmadian, Fereydoon
    World Academy of Science, Engineering and Technology, 2010, 72 : 495 - 500
  • [38] Comparison of machine learning methods for the detection of focal cortical dysplasia lesions: decision tree, support vector machine and artificial neural network
    Ganji, Zohreh
    Hakak, Mohsen Aghaee
    Zare, Hoda
    NEUROLOGICAL RESEARCH, 2022, 44 (12) : 1142 - 1149
  • [39] Application of M5 tree regression, MARS, and artificial neural network methods to predict the Nusselt number and output temperature of CuO based nanofluid flows in a car radiator
    Kahani, Mostafa
    Ghazvini, Mahyar
    Mohseni-Gharyehsafa, Behnam
    Ahmadi, Mohammad Hossein
    Pourfarhang, Amin
    Shokrgozar, Motahareh
    Heris, Saeed Zeinali
    INTERNATIONAL COMMUNICATIONS IN HEAT AND MASS TRANSFER, 2020, 116 (116)
  • [40] A Comparison of Artificial Neural Network and Decision Trees with Logistic Regression as Classification Models for Breast Cancer Survival
    Mudunuru, Venkateswara Rao
    Skrzypek, Leslaw A.
    INTERNATIONAL JOURNAL OF MATHEMATICAL ENGINEERING AND MANAGEMENT SCIENCES, 2020, 5 (06) : 1170 - 1190