The complex structures and intricate hyperparameters of existing deep learning (DL) models make achieving higher accuracy in landslide susceptibility assessment (LSA) time-consuming and labor-intensive. Deep forest (DF) is a decision tree-based DL framework that processes features through a cascade structure whose depth adapts to the input data. To obtain a better-performing landslide susceptibility model, this study combined convolutional neural networks (CNNs) with DF in a model referred to as CNN-DF. The Bailong River Basin, a region severely affected by landslides, was chosen as the study area. First, the landslide inventory and influencing factors of the study area were obtained. Second, equal numbers of landslide and nonlandslide samples were selected under similar environmental constraints to establish the dataset. Third, a CNN was used to extract high-level features from the raw data, which were then input into the DF model for training and testing. Finally, the trained model was used to predict landslide susceptibility. The CNN-DF model achieved high prediction accuracy, with an area under the receiver operating characteristic curve (AUC) of 0.9061 on the testing set, outperforming DF, CNN, and other commonly used machine learning models. In the landslide susceptibility maps (LSMs), the proportion of historical landslides falling in the very high susceptibility category was also higher for CNN-DF than for the other models. These results indicate that CNN-DF is feasible for LSA, offering higher efficiency and more accurate results. In addition, the SHAP algorithm was used to quantify the contribution of each feature to the prediction results both globally and locally, further explaining the model. The LSM based on CNN-DF can provide a scientific basis for landslide prevention and disaster management in the study area.
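As a reading aid for the AUC value reported above, the following minimal sketch shows one common way to compute AUC: as the probability that a randomly chosen landslide sample receives a higher susceptibility score than a randomly chosen nonlandslide sample, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical, chosen only for illustration; they are not data or code from this study, and the study may have computed AUC with a different tool.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for landslide, 0 for nonlandslide (assumed encoding).
    scores: predicted susceptibility, higher means more susceptible.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


# Toy example (invented values, not results from the study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based implementation would be preferable for a full testing set.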