Application of Ensemble Machine Learning for Classification Problems on Very Small Datasets

Cited: 0
Authors
Pavic, Ognjen [1 ]
Dasic, Lazar [1 ]
Geroski, Tijana [2 ,3 ]
Pirkovic, Marijana Stanojevic [4 ]
Milovanovic, Aleksandar [1 ]
Filipovic, Nenad [2 ,3 ]
Affiliations
[1] Univ Kragujevac, Inst Informat Technol, Kragujevac 34000, Serbia
[2] Univ Kragujevac, Fac Engn, Kragujevac 34000, Serbia
[3] Bioengn Res & Dev Ctr BioIRC, Kragujevac 34000, Serbia
[4] Univ Kragujevac, Fac Med Sci, Kragujevac 34000, Serbia
Keywords
Machine learning; Classification; Risk assessment; Random forest; Ensemble
DOI
10.1007/978-3-031-60840-7_15
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Machine learning is one of the most widely used branches of artificial intelligence in recent years. It is most commonly applied to classification or regression problems through supervised learning. Machine learning models require high-quality data in sufficient quantity to produce good results. This paper investigates an approach that aggregates multiple machine learning models into an ensemble to increase prediction capability when only a very limited amount of data is available for training. The ensemble model was trained on a dataset of patient fractional flow reserve biomarkers, with the goal of classifying patients into risk classes according to their risk of suffering an acute myocardial infarction. The ensemble comprised multiple random forest classification models trained on different combinations of training and test data to improve prediction accuracy over a single random forest model. The final ensemble achieved a prediction accuracy of 71.3%, an immense improvement over the 36% accuracy of a single random forest classification model.
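The abstract describes an ensemble of random forest classifiers, each trained on a different split of a small dataset, with the members' predictions aggregated to produce the final output. The following is a minimal sketch of that idea, not the paper's implementation: the exact split strategy, aggregation rule, and hyperparameters are assumptions here (random resplitting and majority voting), and synthetic data stands in for the fractional flow reserve dataset, which is not publicly reproduced.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a small multi-class biomarker dataset.
X, y = make_classification(n_samples=120, n_features=8, n_informative=5,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# Train several random forests, each on a different random subset of the
# (already small) training data, to obtain diverse ensemble members.
members = []
for seed in range(10):
    X_sub, _, y_sub, _ = train_test_split(
        X_train, y_train, test_size=0.2, stratify=y_train, random_state=seed)
    rf = RandomForestClassifier(n_estimators=100, random_state=seed)
    rf.fit(X_sub, y_sub)
    members.append(rf)

# Aggregate member predictions by majority vote (one assumed choice of
# aggregation rule) over each test sample.
votes = np.stack([m.predict(X_test) for m in members])
y_pred = np.apply_along_axis(lambda col: np.bincount(col).argmax(),
                             axis=0, arr=votes)
accuracy = float((y_pred == y_test).mean())
```

On very small datasets, a single train/test split gives a noisy accuracy estimate; averaging many models fit on different splits, as sketched above, reduces the variance of both the predictions and the evaluation.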
Pages: 108-115
Page count: 8