Application of Ensemble Machine Learning for Classification Problems on Very Small Datasets

Authors
Pavic, Ognjen [1]
Dasic, Lazar [1]
Geroski, Tijana [2,3]
Pirkovic, Marijana Stanojevic [4]
Milovanovic, Aleksandar [1]
Filipovic, Nenad [2,3]
Affiliations
[1] Univ Kragujevac, Inst Informat Technol, Kragujevac 34000, Serbia
[2] Univ Kragujevac, Fac Engn, Kragujevac 34000, Serbia
[3] Bioengn Res & Dev Ctr BioIRC, Kragujevac 34000, Serbia
[4] Univ Kragujevac, Fac Med Sci, Kragujevac 34000, Serbia
Keywords
Machine learning; Classification; Risk assessment; Random forest; Ensemble
DOI
10.1007/978-3-031-60840-7_15
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, machine learning has been one of the most widely used branches of artificial intelligence. It is most commonly applied to classification or regression problems through supervised learning. Machine learning models require data of high quality and in sufficient quantity to produce good results. This paper investigates an approach that uses ensemble learning, aggregating multiple machine learning models, to improve predictive performance when only a very limited amount of training data is available. The ensemble model was trained on a patient fractional flow reserve biomarker dataset with the goal of classifying patients into risk classes based on their risk of suffering an acute myocardial infarction. The ensemble consisted of multiple random forest classification models, each trained with a different combination of training and test data, to improve prediction accuracy over a single random forest model. The final ensemble achieved a prediction accuracy of 71.3%, a substantial improvement over the 36% prediction accuracy of a single random forest classification model.
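The abstract describes an ensemble of random forest classifiers, each trained on a different combination of training and test data. A minimal sketch of that general idea follows. It assumes scikit-learn and NumPy, uses synthetic data in place of the fractional flow reserve dataset, and aggregates by unweighted majority vote; the abstract does not state the aggregation rule, the number of members, or any hyperparameters, so the function names `fit_split_ensemble` and `predict_majority` and every parameter value here are hypothetical. It is not intended to reproduce the reported accuracies.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def fit_split_ensemble(X, y, n_members=15, test_size=0.25, seed=0):
    """Fit one random forest per random stratified train/test split.

    Returns a list of (fitted_forest, accuracy_on_that_split's_test_part).
    n_members and test_size are illustrative assumptions.
    """
    members = []
    for i in range(n_members):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, stratify=y, random_state=seed + i
        )
        forest = RandomForestClassifier(n_estimators=100, random_state=seed + i)
        forest.fit(X_tr, y_tr)
        members.append((forest, forest.score(X_te, y_te)))
    return members


def predict_majority(members, X):
    """Aggregate member predictions by unweighted majority vote (assumed rule)."""
    # votes has shape (n_members, n_samples)
    votes = np.stack([forest.predict(X) for forest, _ in members])
    classes = np.unique(votes)
    # counts has shape (n_classes, n_samples); ties go to the lowest label
    counts = np.stack([(votes == c).sum(axis=0) for c in classes])
    return classes[counts.argmax(axis=0)]


# Synthetic stand-in for a very small three-class dataset.
X, y = make_classification(
    n_samples=60, n_features=8, n_informative=4, n_classes=3, random_state=0
)
# Keep a final hold-out set that no ensemble member ever sees.
X_dev, X_hold, y_dev, y_hold = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
members = fit_split_ensemble(X_dev, y_dev)
y_pred = predict_majority(members, X_hold)
print("ensemble accuracy on hold-out:", (y_pred == y_hold).mean())
```

Keeping a separate hold-out set matters in this sketch because each member's own test portion overlaps with other members' training data, so per-member scores alone would overstate how well the ensemble generalizes.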
Pages: 108-115
Number of pages: 8