A Comparison of Machine Learning Methods in a High-Dimensional Classification Problem

Cited by: 8
Authors
Zekic-Susac, Marijana [1]
Pfeifer, Sanja [1]
Sarlija, Natasa [1]
Affiliations
[1] Univ Josip Juraj Strossmayer Osijek, Fac Econ, Osijek, Croatia
Source
BUSINESS SYSTEMS RESEARCH JOURNAL | 2014, Vol. 5, No. 3
Keywords
machine learning; support vector machines; artificial neural networks; CART classification trees; k-nearest neighbour; large-dimensional data; cross-validation
DOI
10.2478/bsrj-2014-0021
Chinese Library Classification
F [Economics]
Discipline classification code
02
Abstract
Background: Large-dimensional data modelling often relies on variable reduction methods in the pre-processing and post-processing stages. However, such reduction usually preserves less information and yields a lower model accuracy. Objectives: The aim of this paper is to assess the high-dimensional classification problem of recognizing students' entrepreneurial intentions by means of machine learning methods. Methods/Approach: Four methods were tested on the same dataset: artificial neural networks, CART classification trees, support vector machines, and k-nearest neighbour, in order to compare their classification accuracy. The performance of each method was compared on ten subsamples in a 10-fold cross-validation procedure, computing the sensitivity and specificity of each model. Results: The artificial neural network model based on a multilayer perceptron yielded a higher classification rate than the models produced by the other methods. The pairwise t-test showed a statistically significant difference between the artificial neural network and the k-nearest neighbour model, while the differences among the other methods were not statistically significant. Conclusions: The tested machine learning methods are able to learn fast and achieve high classification accuracy. However, further improvement may be achieved by testing a few additional methodological refinements in machine learning methods.
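The evaluation quantities named in the abstract (sensitivity, specificity, and a paired t-test on per-fold results) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0/1 label coding, the toy labels, and the per-fold accuracy values are all assumptions made up for illustration, and they are not results from the paper.

```python
import math
from statistics import mean, stdev


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1.

    Assumes at least one positive and one negative case in y_true.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic on matched per-fold scores (n - 1 degrees of freedom)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))


# Toy labels (hypothetical): 4 positive and 4 negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)

# Hypothetical accuracies of two models on the same 10 folds.
fold_acc_a = [0.80, 0.82, 0.78, 0.85, 0.81, 0.79, 0.83, 0.80, 0.84, 0.82]
fold_acc_b = [0.76, 0.79, 0.77, 0.80, 0.78, 0.77, 0.79, 0.78, 0.80, 0.79]
t_stat = paired_t_statistic(fold_acc_a, fold_acc_b)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} t={t_stat:.2f}")
```

The t statistic would then be compared against a t distribution with 9 degrees of freedom for ten folds. Pairing the scores by fold is what makes the test "pairwise": each difference is taken between two models evaluated on the same subsample.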
Pages: 82-96
Number of pages: 15
Related papers
50 in total
  • [1] PERFORMANCE OF MACHINE LEARNING METHODS IN CLASSIFICATION MODELS WITH HIGH-DIMENSIONAL DATA
    Zekic-Susac, Marijana
    Pfeifer, Sanja
    Sarlija, Natasa
    SOR'13 PROCEEDINGS: THE 12TH INTERNATIONAL SYMPOSIUM ON OPERATIONAL RESEARCH IN SLOVENIA, 2013, : 219 - 224
  • [2] Morphological classification of brains via high-dimensional shape transformations and machine learning methods
    Lao, ZQ
    Shen, DG
    Xue, Z
    Karacali, B
    Resnick, SM
    Davatzikos, C
    NEUROIMAGE, 2004, 21 (01) : 46 - 57
  • [3] A comparison of machine learning methods for survival analysis of high-dimensional clinical data for dementia prediction
    Spooner, Annette
    Chen, Emily
    Sowmya, Arcot
    Sachdev, Perminder
    Kochan, Nicole A.
    Trollor, Julian
    Brodaty, Henry
    SCIENTIFIC REPORTS, 2020, 10 (01)
  • [4] A comparison of machine learning methods for survival analysis of high-dimensional clinical data for dementia prediction
    Annette Spooner
    Emily Chen
    Arcot Sowmya
    Perminder Sachdev
    Nicole A. Kochan
    Julian Trollor
    Henry Brodaty
    Scientific Reports, 10
  • [5] Novel machine learning approach for classification of high-dimensional microarray data
    Musheer, Rabia Aziz
    Verma, C. K.
    Srivastava, Namita
    SOFT COMPUTING, 2019, 23 (24) : 13409 - 13421
  • [6] Novel machine learning approach for classification of high-dimensional microarray data
    Rabia Aziz Musheer
    C. K. Verma
    Namita Srivastava
    Soft Computing, 2019, 23 : 13409 - 13421
  • [7] Machine Learning Regularization Methods in High-Dimensional Monetary and Financial VARs
    Sanchez Garcia, Javier
    Cruz Rambaud, Salvador
    MATHEMATICS, 2022, 10 (06)
  • [8] Classification methods for high-dimensional genetic data
    Kalina, Jan
    BIOCYBERNETICS AND BIOMEDICAL ENGINEERING, 2014, 34 (01) : 10 - 18
  • [9] High-Dimensional Ensemble Learning Classification: An Ensemble Learning Classification Algorithm Based on High-Dimensional Feature Space Reconstruction
    Zhao, Miao
    Ye, Ning
    APPLIED SCIENCES-BASEL, 2024, 14 (05)
  • [10] Learning distance to subspace for the nearest subspace methods in high-dimensional data classification
    Zhu, Rui
    Dong, Mingzhi
    Xue, Jing-Hao
    INFORMATION SCIENCES, 2019, 481 : 69 - 80