A Comparison of Machine Learning Methods in a High-Dimensional Classification Problem

Cited by: 8
Authors
Zekic-Susac, Marijana [1]
Pfeifer, Sanja [1]
Sarlija, Natasa [1]
Affiliation
[1] Univ Josip Juraj Strossmayer Osijek, Fac Econ, Osijek, Croatia
Source
BUSINESS SYSTEMS RESEARCH JOURNAL | 2014, Vol. 5, No. 3
Keywords
machine learning; support vector machines; artificial neural networks; CART classification trees; k-nearest neighbour; large-dimensional data; cross-validation
DOI
10.2478/bsrj-2014-0021
Chinese Library Classification
F [Economics]
Discipline classification code
02
Abstract
Background: Large-dimensional data modelling often relies on variable reduction methods in the pre-processing and in the post-processing stage. However, such a reduction usually provides less information and yields a lower accuracy of the model. Objectives: The aim of this paper is to assess the high-dimensional classification problem of recognizing entrepreneurial intentions of students by machine learning methods. Methods/Approach: Four methods were tested: artificial neural networks, CART classification trees, support vector machines, and k-nearest neighbour on the same dataset in order to compare their efficiency in the sense of classification accuracy. The performance of each method was compared on ten subsamples in a 10-fold cross-validation procedure in order to assess computing sensitivity and specificity of each model. Results: The artificial neural network model based on multilayer perceptron yielded a higher classification rate than the models produced by other methods. The pairwise t-test showed a statistical significance between the artificial neural network and the k-nearest neighbour model, while the difference among other methods was not statistically significant. Conclusions: Tested machine learning methods are able to learn fast and achieve high classification accuracy. However, further advancement can be assured by testing a few additional methodological refinements in machine learning methods.
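The evaluation procedure described in the abstract (10-fold cross-validation, sensitivity and specificity per model, and a pairwise t-test on the per-fold results) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function names `k_fold_indices`, `sensitivity_specificity`, and `paired_t_statistic` are hypothetical, the round-robin fold assignment is an assumption, and no data from the paper are reproduced.

```python
import math
from statistics import mean, stdev


def k_fold_indices(n_samples, k=10):
    """Assign sample indices to k folds round-robin.

    Assumption: the samples were shuffled beforehand; the paper's
    exact splitting scheme is not stated in the abstract.
    """
    return [list(range(i, n_samples, k)) for i in range(k)]


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic for two models scored on the same folds.

    The result has len(scores_a) - 1 degrees of freedom
    (9 for a 10-fold procedure).
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length score lists with >= 2 folds")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all fold differences are identical; t is undefined")
    return mean(diffs) / (sd / math.sqrt(len(diffs)))
```

In use, each of the four methods would be trained on nine folds and scored on the held-out fold, giving ten accuracy values per method; `paired_t_statistic` would then be applied to each pair of methods and the result compared against a t distribution with 9 degrees of freedom.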
Pages: 82-96
Page count: 15
Related papers
50 in total
  • [21] An overview of modern machine learning methods for effect measure modification analyses in high-dimensional settings
    Cheung, Michael
    Dimitrova, Anna
    Benmarhnia, Tarik
    SSM-POPULATION HEALTH, 2025, 29
  • [22] Can We Train Machine Learning Methods to Outperform the High-dimensional Propensity Score Algorithm?
    Karim, Mohammad Ehsanul
    Pang, Menglan
    Platt, Robert W.
    EPIDEMIOLOGY, 2018, 29 (02) : 191 - 198
  • [23] Machine learning on high dimensional shape data from subcortical brain surfaces: A comparison of feature selection and classification methods
    Wade, Benjamin S. C.
    Joshi, Shantanu H.
    Gutman, Boris A.
    Thompson, Paul M.
    PATTERN RECOGNITION, 2017, 63 : 731 - 739
  • [24] INTERPRETABLE MACHINE LEARNING OF HIGH-DIMENSIONAL AGING HEALTH TRAJECTORIES
    Farrell, Spencer
    Mitnitski, Arnold
    Rockwood, Kenneth
    Rutenberg, Andrew
    INNOVATION IN AGING, 2021, 5 : 672 - 672
  • [25] Comparison of Machine Learning Methods in Classification of Affective Disorders
    Kinder, I.
    Friganovic, K.
    Vukojevic, J.
    Mulc, D.
    Slukan, T.
    Vidovic, D.
    Brecic, P.
    Cifrek, M.
    2020 43RD INTERNATIONAL CONVENTION ON INFORMATION, COMMUNICATION AND ELECTRONIC TECHNOLOGY (MIPRO 2020), 2020, : 177 - 181
  • [26] Learning from High-Dimensional Data in Multitask/Multilabel Classification
    Kwok, James T.
    2013 SECOND IAPR ASIAN CONFERENCE ON PATTERN RECOGNITION (ACPR 2013), 2013, : 16 - 17
  • [28] High-dimensional role of AI and machine learning in cancer research
    Capobianco, Enrico
    BRITISH JOURNAL OF CANCER, 2022, 126 (04) : 523 - 532
  • [29] Exploring the robust extrapolation of high-dimensional machine learning potentials
    Zeni, Claudio
    Anelli, Andrea
    Glielmo, Aldo
    Rossi, Kevin
    PHYSICAL REVIEW B, 2022, 105 (16)
  • [30] A novel feature learning framework for high-dimensional data classification
    Li, Yanxia
    Chai, Yi
    Yin, Hongpeng
    Chen, Bo
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2021, 12 (02) : 555 - 569