Performance analysis of machine learning algorithms and screening formulae for β-thalassemia trait screening of Indian antenatal women

Cited by: 8
Authors
Das, Reena [1]
Saleh, Sarkaft [2]
Nielsen, Izabela [2]
Kaviraj, Anilava [3]
Sharma, Prashant [1]
Dey, Kartick [4]
Saha, Subrata [2]
Affiliations
[1] Postgrad Inst Med Educ & Res, Dept Hematol, Chandigarh 160012, India
[2] Aalborg Univ, Dept Mat & Prod, DK-9220 Aalborg, Denmark
[3] Univ Kalyani, Dept Zool, Kalyani 741235, W Bengal, India
[4] Univ Engn & Management, Dept Math, Kolkata 700160, India
Keywords
beta-Thalassemia carrier screening; Supervised machine learning algorithm; Multi-criteria decision-making; Antenatal Women; Diagnostic performance; IRON-DEFICIENCY ANEMIA; MULTILAYER PERCEPTRON; CELL INDEXES; DIFFERENTIATION; DIAGNOSIS; DISCRIMINANT; ALTERNATIVES; CLASSIFIER; PREVALENT; CRITERIA
DOI
10.1016/j.ijmedinf.2022.104866
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Background: More than forty discrimination formulae based on red blood cell (RBC) parameters, along with several supervised machine learning algorithms (MLAs), have been recommended for beta-thalassemia trait (BTT) screening. This study aimed to evaluate and compare the performance of 26 such formulae and 13 MLAs on data from antenatal women against a recently developed formula, SCSBTT, which is available for evaluation in over seventy countries as an Android app called SUSOKA [16]. Methods: A diagnostic database of 2942 antenatal women was collected from PGIMER, Chandigarh, India, and used for this analysis. The data set comprises subjects with hypochromic microcytic anemia, BTT, Hemoglobin E trait, double heterozygosity for Hemoglobin S and BTT, and heterozygosity for Hemoglobin D Punjab, as well as normal subjects. The performance of the formulae and the MLAs was assessed by sensitivity, specificity, Youden's index, and AUC-ROC. A final recommendation was made from the rankings obtained through two Multiple Criteria Decision-Making (MCDM) techniques, namely Simultaneous Evaluation of Criteria and Alternatives (SECA) and TOPSIS. Results: Extreme Learning Machine (ELM) and Gradient Boosting Classifier (GBC) showed the highest Youden's index and AUC-ROC values compared with all discriminating formulae. Sensitivity was highest for SCSBTT. K-means clustering and the rankings from the MCDM methods show that SCSBTT, Shine & Lal, and Ravanbakhsh-F4 perform best among the formulae. The discriminant power of some MLAs and formulae was considerably lower than that reported in the original studies. Conclusion: Comparative information on MLAs can aid researchers in developing new discriminating formulae that simultaneously ensure high sensitivity and specificity. Further multi-centric verification of the formulae on heterogeneous data is indispensable. Based on MCDM, the SCSBTT and Shine & Lal formulae and the ELM and GBC algorithms are recommended for BTT screening. SCSBTT can be used with certainty as a tangible cost-saving tool for mass screening of antenatal women in India and other countries.
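The screening measures named in the Methods can be illustrated with a minimal sketch. The confusion-matrix counts below are hypothetical and are not taken from the study, and the function name `screening_metrics` is an assumption made for illustration only. Youden's index is computed with its standard definition, sensitivity + specificity - 1.

```python
def screening_metrics(tp, fn, tn, fp):
    """Compute sensitivity, specificity and Youden's index from confusion counts.

    tp: carriers correctly flagged       fn: carriers missed
    tn: non-carriers correctly cleared   fp: non-carriers wrongly flagged
    """
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    youden = sensitivity + specificity - 1
    return sensitivity, specificity, youden


# Hypothetical counts for one screening rule (NOT data from the study).
sens, spec, j = screening_metrics(tp=90, fn=10, tn=160, fp=40)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} Youden's index={j:.2f}")
```

A rule that flags every subject reaches a sensitivity of 1 but a specificity of 0, so its Youden's index is 0. This is why the index is a useful single summary when comparing formulae and classifiers that trade one measure against the other.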
Pages: 9
Related papers
83 in total
  • [1] Identifying beta-thalassemia carriers using a data mining approach: The case of the Gaza Strip, Palestine
    AlAgha, Alaa S.
    Faris, Hossam
    Hammo, Bassam H.
    Al-Zoubi, Ala M.
    [J]. ARTIFICIAL INTELLIGENCE IN MEDICINE, 2018, 88 : 70 - 83
  • [2] A comparative study of K-Nearest Neighbour, Support Vector Machine and Multi-Layer Perceptron for Thalassemia screening
    Amendolia, SR
    Cossu, G
    Ganadu, ML
    Golosio, B
    Masala, GL
    Mura, GM
    [J]. CHEMOMETRICS AND INTELLIGENT LABORATORY SYSTEMS, 2003, 69 (1-2) : 13 - 20
  • [3] [Anonymous], 1999, Lab. Hematol
  • [4] [Anonymous], 1999, J. Pediatr. Hematol. Oncol, DOI 10.1097/00043426-199907000-00040
  • [5] Prioritization of renewable energy resources based on sustainable management approach using simultaneous evaluation of criteria and alternatives: A case study on Iran's electricity industry
    Assadi, Mohammad Reza
    Ataebi, Melikasadat
    Ataebi, Elmira Sadat
    Hasani, Aliakbar
    [J]. RENEWABLE ENERGY, 2022, 181 : 820 - 832
  • [6] Optimal wastewater allocation with the development of an SECA multi-criteria decision-making method
    Azbari, Kosar Ebrahimzadeh
    Ashofteh, Parisa-Sadat
    Golfam, Parvin
    Singh, Vijay P.
    [J]. JOURNAL OF CLEANER PRODUCTION, 2021, 321
  • [7] A computer-based approach for data analyzing in hospital's health-care waste management sector by developing an index using consensus-based fuzzy multi-criteria group decision-making models
    Baghapour, Mohammad Ali
    Shooshtarian, Mohammad Reza
    Javaheri, Mohammad Reza
    Dehghanifard, Sina
    Sefidkar, Razieh
    Nobandegani, Amir Fadaei
    [J]. INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2018, 118 : 5 - 15
  • [8] Security-based critical power distribution feeder identification: Application of fuzzy BWM-VIKOR and SECA
    Bahrami, Sina
    Rastegar, Mohammad
    [J]. INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2022, 134
  • [9] Differential Diagnostics of Thalassemia Minor by Artificial Neural Networks Model
    Barnhart-Magen, Guy
    Gotlib, Victor
    Marilus, Rafael
    Einav, Yulia
    [J]. JOURNAL OF CLINICAL LABORATORY ANALYSIS, 2013, 27 (06) : 481 - 486
  • [10] Predicting length of stay and mortality among hospitalized patients with type 2 diabetes mellitus and hypertension
    Barsasella, Diana
    Gupta, Srishti
    Malwade, Shwetambara
    Aminin
    Susanti, Yanti
    Tirmadi, Budi
    Mutamakin, Agus
    Jonnagaddala, Jitendra
    Syed-Abdul, Shabbir
    [J]. INTERNATIONAL JOURNAL OF MEDICAL INFORMATICS, 2021, 154