Application of Ensemble Machine Learning for Classification Problems on Very Small Datasets

Cited by: 0
Authors
Pavic, Ognjen [1]
Dasic, Lazar [1]
Geroski, Tijana [2, 3]
Pirkovic, Marijana Stanojevic [4]
Milovanovic, Aleksandar [1]
Filipovic, Nenad [2, 3]
Affiliations
[1] Univ Kragujevac, Inst Informat Technol, Kragujevac 34000, Serbia
[2] Univ Kragujevac, Fac Engn, Kragujevac 34000, Serbia
[3] Bioengn Res & Dev Ctr BioIRC, Kragujevac 34000, Serbia
[4] Univ Kragujevac, Fac Med Sci, Kragujevac 34000, Serbia
Keywords
Machine learning; Classification; Risk assessment; Random forest; Ensemble
DOI
10.1007/978-3-031-60840-7_15
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Machine learning has been one of the most widely used branches of artificial intelligence in recent years. It is most commonly applied to classification or regression problems through supervised learning. Machine learning models require data of high quality and in sufficient quantity to produce good results. This paper investigates an approach that uses ensemble learning, aggregating multiple machine learning models, to increase predictive capability when only a very limited amount of training data is available. The ensemble model was trained on a patient fractional flow reserve biomarker dataset with the goal of classifying patients into risk classes based on their risk of suffering an acute myocardial infarction. The ensemble consisted of multiple random forest classification models, each trained with a different combination of training and test data, to improve prediction accuracy over a single random forest model. The final ensemble achieved a prediction accuracy of 71.3%, a substantial improvement over the 36% prediction accuracy of a single random forest classification model.
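The aggregation idea in the abstract can be illustrated with a minimal sketch. The abstract does not state how the individual random forests' outputs are combined, so majority voting is an assumption here, and the labels, the function names `majority_vote` and `accuracy`, and the toy predictions are all hypothetical. In practice each prediction list would come from a separately trained model (for example, a random forest fitted on a different train/test split); the numbers below are illustrative only and are not the paper's data or results.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-model predictions (one inner sequence per model) by majority vote.

    Assumption: majority voting is used as the aggregation rule; the source
    abstract does not specify the rule the authors applied.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all models must predict the same number of samples")

    combined = []
    # zip(*predictions) yields, for each sample, the tuple of votes from all models.
    for sample_votes in zip(*predictions):
        # Ties go to the label that was encountered first among the votes.
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical toy data: true risk classes and three models' predictions.
y_true = ["low", "high", "high", "low", "high"]
model_predictions = [
    ["low", "high", "low", "low", "high"],
    ["low", "low", "high", "high", "high"],
    ["high", "high", "high", "low", "low"],
]

ensemble_prediction = majority_vote(model_predictions)
individual_accuracies = [accuracy(y_true, p) for p in model_predictions]
ensemble_accuracy = accuracy(y_true, ensemble_prediction)
```

In this toy case each model errs on different samples, so the vote corrects the individual mistakes; an ensemble helps far less when the base models make the same errors.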
Pages: 108-115
Number of pages: 8
Related papers
50 in total
  • [21] An investigation into the application of ensemble learning for entailment classification
    Rooney, Niall
    Wang, Hui
    Taylor, Philip S.
    INFORMATION PROCESSING & MANAGEMENT, 2014, 50 (01) : 87 - 103
  • [22] Spectral methods in machine learning and new strategies for very large datasets
    Belabbas, Mohamed-Ali
    Wolfe, Patrick J.
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2009, 106 (02) : 369 - 374
  • [23] Guidelines to Select Machine Learning Scheme for Classification of Biomedical Datasets
    Tanwani, Ajay Kumar
    Afridi, Jamal
    Shafiq, M. Zubair
    Farooq, Muddassar
    EVOLUTIONARY COMPUTATION, MACHINE LEARNING AND DATA MINING IN BIOINFORMATICS, PROCEEDINGS, 2009, 5483 : 128 - 139
  • [24] Machine Learning-based Classification of Online Industrial Datasets
    Faber, Rastislav
    L'ubusky, Karol
    Paulen, Radoslav
    2023 24TH INTERNATIONAL CONFERENCE ON PROCESS CONTROL, PC, 2023, : 132 - 137
  • [25] A Framework for Effective Application of Machine Learning to Microbiome-Based Classification Problems
    Topcuoglu, Begum D.
    Lesniak, Nicholas A.
    Ruffin, Mack T.
    Wiens, Jenna
    Schloss, Patrick D.
    MBIO, 2020, 11 (03):
  • [26] Review of Immunotherapy Classification: Application Domains, Datasets, Algorithms and Software Tools from Machine Learning Perspective
    Mahmoud, Ahsanullah Yunas
    Neagu, Daniel
    Scrimieri, Daniele
    Abdullatif, Amr Rashad Ahmed
    2022 32ND CONFERENCE OF OPEN INNOVATIONS ASSOCIATION (FRUCT), 2022, : 152 - 161
  • [27] Fuzzy cognitive map ensemble learning paradigm to solve classification problems: Application to autism identification
    Papageorgiou, Elpiniki I.
    Kannappan, Arthi
    APPLIED SOFT COMPUTING, 2012, 12 (12) : 3798 - 3809
  • [28] Application of Machine Learning in Discharge Classification
    Brar, Ramanpreet K.
    El-Hag, Ayman H.
    2020 IEEE CONFERENCE ON ELECTRICAL INSULATION AND DIELECTRIC PHENOMENA (2020 IEEE CEIDP), 2020, : 43 - 46
  • [29] A strategy to apply machine learning to small datasets in materials science
    Ying Zhang
    Chen Ling
    npj Computational Materials, 4
  • [30] Decomposition Methods for Machine Learning with Small, Incomplete or Noisy Datasets
    Caiafa, Cesar Federico
    Sole-Casals, Jordi
    Marti-Puig, Pere
    Zhe, Sun
    Tanaka, Toshihisa
    APPLIED SCIENCES-BASEL, 2020, 10 (23) : 1 - 20