Machine Learning-Based Prediction of No-Show Telemedicine Encounters

Times cited: 0
Authors
Reategui-Rivera, C. Mahony [1]
Cui, Wanting [1]
Escobar-Agreda, Stefan [2]
Rojas-Mezarina, Leonardo [2]
Finkelstein, Joseph [1]
Affiliations
[1] Univ Utah, Sch Med, Dept Biomed Informat, 421 Wakara Way, Salt Lake City, UT 84108 USA
[2] Univ Nacl Mayor San Marcos, Sch Med, Telehlth Unit, Lima, Peru
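Source
TELEMEDICINE REPORTS | 2025, Vol. 6, Issue 1
Funding
National Institutes of Health (NIH), USA
Keywords
artificial intelligence; machine learning; no-show; Peru; prediction; telehealth; telemedicine; COST
DOI
10.1089/tmr.2025.0009
Chinese Library Classification number
R19 [Health care organizations and services (health services administration)]
Subject classification code
Abstract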
Aim: This study aimed to evaluate the performance of machine learning (ML) models in predicting patient no-shows for telemedicine appointments within the Peruvian health system and to identify key predictors of nonattendance.

Methods: We performed a retrospective observational study using anonymized data (June 2019 to November 2023) from "Teleatiendo." The dataset included over 1.5 million completed appointments and about 64,000 no-shows (4.1%), focusing on teleorientation and telemonitoring. Predictor variables included patient demographics, socioeconomic factors, health care facility characteristics, appointment timing, and telemedicine service types. A 70% training, 10% validation, and 20% testing split was used over 10 iterations, with hyperparameter tuning performed on the validation set to identify optimal model parameters. Multiple ML approaches (random forest, XGBoost, LightGBM, and anomaly detection) were implemented in combination with undersampling and cost-sensitive learning to address class imbalance. Performance was evaluated using precision, recall, specificity, area under the curve (AUC), F1-score, and accuracy.

Results: Of the models tested, undersampling with XGBoost achieved a precision of 0.115 (± 0.001), recall of 0.654 (± 0.005), specificity of 0.786 (± 0.002), AUC of 0.720 (± 0.002), and accuracy of 0.780 (± 0.002). In contrast, cost-sensitive XGBoost exhibited a balanced performance, with a precision of 0.123 (± 0.001), recall of 0.639 (± 0.006), specificity of 0.805 (± 0.004), AUC of 0.722 (± 0.001), and accuracy of 0.799 (± 0.003). Additionally, cost-sensitive random forest achieved the highest specificity (0.843 ± 0.002) and accuracy (0.832 ± 0.001) but recorded a lower recall (0.585 ± 0.004), while cost-sensitive LightGBM and balanced random forest yielded performance metrics similar to those of cost-sensitive XGBoost. Isolation forest, used for anomaly detection, demonstrated the lowest performance.

Conclusions: ML models can moderately predict telemedicine no-shows in Peru, with cost-sensitive boosting techniques enhancing the identification of high-risk patients. Key predictors reflect both individual behavior and system-level contexts, suggesting the need for tailored, context-specific interventions. These findings can inform targeted strategies to optimize telemedicine, improve appointment adherence, and promote equitable health care access.
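The low precision values reported above follow largely from the 4.1% no-show rate. The sketch below is not taken from the paper; it is a minimal consistency check, assuming the rounded counts given in the abstract (1.5 million completed appointments, 64,000 no-shows), that derives the precision implied by each model's recall and specificity. The function name `precision_from_rates` and the variable names are hypothetical. The last line computes a negative-to-positive ratio, a common heuristic for XGBoost's `scale_pos_weight` parameter; whether the authors used this value is not stated in the abstract.

```python
def precision_from_rates(recall, specificity, prevalence):
    """Precision implied by recall, specificity, and class prevalence."""
    true_pos = recall * prevalence                         # share of all cases correctly flagged
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # share of all cases wrongly flagged
    return true_pos / (true_pos + false_pos)


# Approximate counts from the abstract (assumption: rounded figures).
completed = 1_500_000
no_shows = 64_000
prevalence = no_shows / (completed + no_shows)

# (recall, specificity, reported precision) as given in the Results.
reported = {
    "undersampling XGBoost": (0.654, 0.786, 0.115),
    "cost-sensitive XGBoost": (0.639, 0.805, 0.123),
}

implied = {
    name: precision_from_rates(recall, specificity, prevalence)
    for name, (recall, specificity, _) in reported.items()
}

for name, (_, _, reported_precision) in reported.items():
    print(f"{name}: implied {implied[name]:.3f}, reported {reported_precision:.3f}")

# Heuristic class weight for cost-sensitive boosting (assumption, not from the paper).
negative_to_positive_ratio = completed / no_shows
print(f"negative-to-positive ratio: {negative_to_positive_ratio:.1f}")
```

Under these assumptions the implied and reported precision agree closely, which suggests that the precision ceiling here is driven mainly by prevalence rather than by a weakness specific to one model.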
Pages: 109-119
Number of pages: 11