Machine Learning for the Prediction of Anemia in Children Under 5 Years of Age by Analyzing their Nutritional Status Using Data Mining

Cited by: 0
Authors
Valdez, Alexander J. Marcos [1]
Ortiz, Eduardo G. Navarro [1]
Peralta, Rodrigo E. Quinteros [1]
Julca, Juan J. Tirado [1]
Ricaldi, David F. Valentin [1]
Calderon-Vilca, Hugo D. [1]
Affiliations
[1] Univ Nacl Mayor San Marcos, San Marcos, Peru
Source
COMPUTACION Y SISTEMAS | 2023 / Vol. 27 / No. 03
Keywords
Anemia; predictive model; malnutrition; children; data mining;
DOI
10.13053/CyS-27-3-4315
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Child malnutrition is one of the main public health problems: it negatively affects individuals throughout their lives, limits the development of society, and makes poverty harder to eradicate. This research has two objectives. The first is to apply data mining techniques (preprocessing, cleaning, reduction, and transformation) to a data lake so that anemia in children under 5 years of age can be analyzed. The second is to apply Machine Learning algorithms to obtain the best model for predicting anemia in children under 5 years of age. The data set was extracted from the open data platform of the government of Peru and covers South Lima, North Lima, East Lima, Central Lima, and rural Lima. It contains 138,369 instances and 36 variables, of which 30 are categorical and 6 numeric, and it is unbalanced. To select the best predictor variables, the ANOVA F-test and Chi-Square filters were used, which reduced the set to 10 variables; experiments were also run omitting one of the filters and omitting both. To find the best prediction model, five algorithms were tested: decision tree, logistic regression, K-nearest neighbors, random forest, and Naive Bayes. The results show that the best algorithm for predicting anemia in children under 5 years of age is Naive Bayes, which achieved the highest recall (74%), with a precision of 43% and an accuracy of 70%.
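The abstract ranks models by recall on an unbalanced data set. The sketch below illustrates why accuracy alone can mislead in that setting. It is not the authors' code: the function name `binary_metrics` and the toy labels are hypothetical, chosen only for illustration, and the numbers it produces are not results from the paper.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Recall, precision and accuracy for binary labels (1 = anemia, 0 = no anemia)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    total = len(pairs)
    return {
        # Share of actual positive cases that the model detects.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of positive predictions that are correct.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of all predictions that are correct.
        "accuracy": (tp + tn) / total if total else 0.0,
    }


# Hypothetical toy labels (NOT data from the paper): 3 positives out of 10.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
model_pred = [1, 1, 0, 1, 1, 0, 0, 0, 0, 0]  # a model that flags some cases
always_negative = [0] * 10                    # a trivial "no anemia" predictor

model_scores = binary_metrics(y_true, model_pred)
trivial_scores = binary_metrics(y_true, always_negative)
print(model_scores)
print(trivial_scores)
```

On these toy labels both predictors reach the same accuracy, but the trivial predictor detects no positive case at all, so its recall is zero. This is one reason recall is a natural selection criterion when the positive class is the minority.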
Pages: 749-768
Page count: 20