Background
Depression is very common among middle-aged and elderly cancer patients and can seriously impair both quality of life and treatment outcomes. This study uses machine learning methods to develop a predictive model that identifies depression risk. Because machine learning models often behave as "black boxes", SHapley Additive exPlanations (SHAP) is used to identify the key risk factors.

Methods
This study included 743 cancer patients aged 45 and above from the 2011-2020 waves of the China Health and Retirement Longitudinal Study (CHARLS) and analyzed 19 variables covering demographic, economic, health, family, and personal factors. Predictive features were first screened with LASSO regression. To determine the best model for prediction, six machine learning models were then trained: Support Vector Machine, K-Nearest Neighbors, Multi-layer Perceptron, Decision Tree, XGBoost, and Random Forest.

Results
Of the six models, the random forest model performed best, with an AUC of 0.774 (95% CI: 0.740-0.809). The model was then interpreted with SHAP, which identified five key features affecting the risk of depression. The feature importance plot shows that predicted depression risk increases markedly for patients with poor life satisfaction.

Conclusions
We developed a clinical prediction model with visual explanations that health care providers can use to estimate the risk of depression in middle-aged and elderly cancer patients. As a tool for early identification of depressive symptoms in this population, the model can help medical workers implement clinical interventions earlier and so obtain better clinical benefits.
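The headline result is an AUC reported with a 95% confidence interval. As a rough illustration of how such a figure can be obtained, the sketch below computes the AUC as a pairwise ranking probability and attaches a percentile bootstrap interval. The abstract does not state how its interval was computed, so the bootstrap is an assumption here; the function names `auc` and `bootstrap_auc_ci` and the four-patient toy data are hypothetical and are not taken from the study.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method,
    not necessarily the one used in the study)."""
    auc(labels, scores)  # raises early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # skip resamples that contain a single class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data, purely illustrative: 1 = depressed, 0 = not depressed.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
point_estimate = auc(toy_labels, toy_scores)  # 3 of 4 pairs ranked correctly -> 0.75
interval = bootstrap_auc_ci(toy_labels, toy_scores, n_boot=200, seed=1)
print(point_estimate, interval)
```

The pairwise definition is quadratic in the sample size, which is acceptable for a cohort of a few hundred patients; a rank-based formula would be the usual choice for larger data.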