Data Mining Techniques for Endometriosis Detection in a Data-Scarce Medical Dataset

Times cited: 0
Authors
Caballero, Pablo [1]
Gonzalez-Abril, Luis [2]
Ortega, Juan A. [3]
Simon-Soro, Aurea [4]
Affiliations
[1] Univ Seville, Int Doctorate Sch, Comp Engn Programme, Seville 41013, Spain
[2] Univ Seville, Fac Econ & Business Sci, Dept Appl Econ 1, Seville 41018, Spain
[3] Univ Seville, Higher Tech Sch Comp Engn, Dept Comp Sci, Seville 41012, Spain
[4] Univ Seville, Fac Dent, Dept Stomatol, Seville 41009, Spain
Keywords
endometriosis; machine learning; artificial intelligence; biomarkers; microbiome; oral systemic; healthcare; SVM; imbalanced datasets; classification; diagnosis
DOI
10.3390/a17030108
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Endometriosis (EM) is a chronic inflammatory estrogen-dependent disorder that affects 10% of women worldwide. It affects the female reproductive tract and its resident microbiota, as well as distal body sites that can serve as surrogate markers of EM. Currently, no single definitive biomarker can diagnose EM. For this pilot study, we analyzed a cohort of 21 patients with endometriosis and infertility-associated conditions. A microbiome dataset was created from five sample types taken from the reproductive and gastrointestinal tracts of each patient: endometrial biopsy, endometrial fluid, vaginal, oral, and fecal samples. We evaluated several machine learning algorithms for EM detection using the features derived from these samples. Despite the limited data, the algorithms achieved high F1 scores. The results also suggest that disease diagnosis could potentially be improved by using less medically invasive procedures. Overall, the results indicate that machine learning algorithms can be useful tools for diagnosing endometriosis in low-resource settings where data availability is limited. We recommend that future studies explore the complexities of the EM disorder using artificial intelligence and prediction modeling to further define the characteristics of the endometriosis phenotype.
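The abstract reports performance in terms of the F1 score, which is the harmonic mean of precision and recall and is often preferred over accuracy when the cohort is small or the classes are imbalanced. The following is a minimal sketch of that metric only, not the authors' pipeline. The label vectors are hypothetical values sized to a 21-patient cohort for illustration; they are not data or results from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    """Return the F1 score for the positive class.

    F1 = 2 * precision * recall / (precision + recall)
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for 21 patients (1 = EM, 0 = no EM). NOT study data.
y_true = [1] * 8 + [0] * 13
# Hypothetical predictions: 6 true positives, 2 false negatives,
# 1 false positive, 12 true negatives.
y_pred = [1] * 6 + [0] * 2 + [1] * 1 + [0] * 12

f1 = f1_score(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"F1 = {f1:.3f}, accuracy = {accuracy:.3f}")
```

With these illustrative labels, precision is 6/7 and recall is 6/8, so F1 is 0.8 while accuracy is 18/21. The two metrics diverge because accuracy also counts the true negatives, which F1 ignores.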
Pages: 25