Machine Learning and Risk Assessment: Random Forest Does Not Outperform Logistic Regression in the Prediction of Sexual Recidivism

Cited by: 4
Authors
Etzler, Sonja [1, 2]
Schonbrodt, Felix D. [3]
Pargent, Florian [3]
Eher, Reinhard [4]
Rettenberger, Martin [2, 5]
Affiliations
[1] Goethe Univ Frankfurt Main, Frankfurt, Germany
[2] Ctr Criminol Kriminol Zent Stelle KrimZ, Luisenstr 7, D-65185 Wiesbaden, Germany
[3] Ludwig Maximilians Univ Munchen, Munich, Germany
[4] Austrian Minist Justice, Fed Evaluat Ctr Violent & Sexual Offenders, Vienna, Austria
[5] Johannes Gutenberg Univ Mainz JGU, Mainz, Germany
Keywords
risk assessment; actuarial instruments; sexual offenses; Static-99; Stable-2007; machine learning; predictive validity; VIOLENCE RISK; ACTUARIAL ASSESSMENT; LINEAR-MODELS; OFFENDERS; CLASSIFICATION; TREE; METAANALYSIS; INFERENCE; ACCURACY; ABSOLUTE;
DOI
10.1177/10731911231164624
Chinese Library Classification number
B849 [Applied Psychology]
Discipline classification code
040203
Abstract
Although many studies have supported the use of actuarial risk assessment instruments (ARAIs) because they outperform unstructured judgments, improving their predictive performance remains an ongoing challenge. Machine learning (ML) algorithms, such as random forests, can detect patterns in data that are useful for prediction without those patterns being explicitly programmed (e.g., by considering nonlinear effects between risk factors and the criterion). Therefore, the current study compares conventional logistic regression analyses with the random forest algorithm on a sample of N = 511 adult male individuals convicted of sexual offenses. Data were collected at the Federal Evaluation Center for Violent and Sexual Offenders in Austria within a prospective-longitudinal research design, and participants were followed up for an average of M = 8.2 years. The Static-99, containing static risk factors, and the Stable-2007, containing stable dynamic risk factors, were included as predictors. The results demonstrated no superior predictive performance of the random forest compared with logistic regression; furthermore, methods of interpretable ML did not point to any robust nonlinear effects. Altogether, the results supported the statistical use of logistic regression for the development and clinical application of ARAIs.
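As a rough illustration of the logistic regression side of such a comparison, the sketch below fits a one-predictor logistic regression to synthetic data and evaluates it on held-out cases with the area under the ROC curve (AUC). Everything in it is an assumption made for illustration and is not taken from the study: the data are simulated, the 0 to 10 "total score" is a hypothetical stand-in for an instrument sum score, the function names are invented, and the choice of AUC as the performance measure is ours. The random forest side is omitted; it would normally come from a library implementation.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, lr=0.5, epochs=3000):
    """Fit intercept b0 and slope b1 by batch gradient descent on the log-loss."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # gradient of log-loss w.r.t. the logit
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def auc(scores, labels):
    """Probability that a random positive case outscores a random negative case.

    Ties count as 0.5. Assumes both classes are present.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def simulate(n, rng):
    # Hypothetical data: integer total score 0..10, rescaled to 0..1.
    # The true log-odds are linear in the score (an assumption of this sketch).
    xs = [rng.randint(0, 10) / 10.0 for _ in range(n)]
    ys = [1 if rng.random() < sigmoid(-3.0 + 4.0 * x) else 0 for x in xs]
    return xs, ys


rng = random.Random(0)
train_x, train_y = simulate(400, rng)
test_x, test_y = simulate(400, rng)

b0, b1 = fit_logistic(train_x, train_y)
test_probs = [sigmoid(b0 + b1 * x) for x in test_x]
test_auc = auc(test_probs, test_y)
print(f"intercept={b0:.2f} slope={b1:.2f} held-out AUC={test_auc:.3f}")
```

Because AUC depends only on the rank order of the predictions, a competing model would have to rank cases differently to score higher on this measure, for example by picking up a nonlinear or interaction effect that a linear logit cannot represent.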
Pages: 460-481
Page count: 22