Predicting congenital syphilis cases: A performance evaluation of different machine learning models

Cited by: 3
Authors
Teixeira, Igor Vitor [1]
Leite, Morgana Thalita da Silva [1]
Melo, Flavio Leandro de Morais [1]
Rocha, Elisson da Silva [1]
Sadok, Sara [2]
Carrarine, Ana Sofia Pessoa da Costa [3]
Santana, Marilia [3]
Rodrigues, Cristina Pinheiro [3]
Oliveira, Ana Maria de Lima [3]
Gadelha, Keduly Vieira [3]
de Morais, Cleber Matos [4]
Kelner, Judith [5]
Endo, Patricia Takako [1]
Affiliations
[1] Univ Pernambuco, Programa Posgrad Engn Computacao, Recife, Brazil
[2] Univ Autonoma Barcelona, Genet Asistencial, Barcelona, Spain
[3] Secretaria Saude Estado Pernambuco, Programa Mae Coruja Pernambucana, Recife, Brazil
[4] Univ Fed Paraiba, Dept Midias Digitais, Joao Pessoa, Brazil
[5] Univ Fed Pernambuco, Ctr Informat, Recife, Brazil
Source
PLOS ONE | 2023 / Vol. 18 / No. 06
Funding
Bill & Melinda Gates Foundation
DOI
10.1371/journal.pone.0276150
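Chinese Library Classification (CLC) number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification code
07; 0710; 09
Abstract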
Background: Communicable diseases represent a huge economic burden for healthcare systems and for society. Sexually transmitted infections (STIs) are a concerning issue, especially in developing and underdeveloped countries, where environmental factors and other determinants of health contribute to their rapid spread. In light of this situation, machine learning techniques have been explored to assess the incidence of syphilis and contribute to epidemiological surveillance in this scenario.
Objective: The main goal of this work is to evaluate the performance of different machine learning models in predicting undesirable outcomes of congenital syphilis, in order to assist resource allocation and optimize healthcare actions, especially in a constrained health environment.
Method: We use clinical and sociodemographic data from pregnant women who were assisted by a social program in Pernambuco, Brazil, named the Mae Coruja Pernambucana Program (PMCP). Following a rigorous methodology, we propose six experiments: we use three feature selection techniques to select the most relevant attributes, pre-process and clean the data, apply hyperparameter optimization to tune the machine learning models, and train and test the models to allow a fair evaluation and discussion.
Results: The AdaBoost-BODS-Expert model, an Adaptive Boosting (AdaBoost) model that used attributes selected by health experts, presented the best results in terms of evaluation metrics and acceptance by health experts from PMCP. With this model, the results are more reliable, allowing its adoption in daily practice to classify possible outcomes of congenital syphilis using clinical and sociodemographic data.
Pages: 25