Robust feature selection is crucial for the credibility and interpretability of machine learning models. Deep learning networks traditionally learn features directly from raw data. However, the multidimensional data collected by sensors in dynamic systems are often redundant, noisy, and high-dimensional, which makes it difficult to select an optimal feature set. To address this problem, we propose a prediction framework whose feature selection is based on causal discovery algorithms. The framework first identifies key features and learns the causal relationships among them, yielding features that are more interpretable and effective; a deep learning model then performs the prediction. Specifically, we introduce a long short-term memory (LSTM) model that incorporates causal discovery and attention mechanisms. We apply the framework to remaining useful life (RUL) prediction on the C-MAPSS dataset. The results show that causal feature selection improves the reliability, interpretability, and generalization of the RUL prediction model, and that our approach outperforms baselines without feature selection in both generalization performance and interpretability.
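To make the idea of screening sensors before prediction concrete, the following is a minimal sketch and not the algorithm used in this paper. It swaps in a much simpler constraint-based test in the spirit of PC-style causal discovery: a feature is kept only if it is correlated with the target and stays correlated after conditioning on each other single feature (first-order partial correlation). The sensor names, the synthetic data, the function names, and the threshold of 0.2 are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no information
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation of x and y after linearly controlling for a single variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    denom = math.sqrt(max(0.0, (1 - rxz ** 2) * (1 - ryz ** 2)))
    if denom < 1e-12:
        return 0.0  # x or y is (almost) fully explained by z
    return (rxy - rxz * ryz) / denom


def select_features(columns, target, threshold=0.2):
    """Keep features dependent on the target marginally and given any other feature."""
    kept = []
    for name, x in columns.items():
        if abs(pearson(x, target)) < threshold:
            continue  # marginally independent of the target
        others = [z for other, z in columns.items() if other != name]
        if all(abs(partial_corr(x, target, z)) >= threshold for z in others):
            kept.append(name)
    return kept


# Synthetic stand-in for sensor data (hypothetical; not C-MAPSS).
random.seed(0)
n = 2000
s_direct = [random.gauss(0, 1) for _ in range(n)]          # drives the target
s_proxy = [a + random.gauss(0, 1) for a in s_direct]       # noisy copy of s_direct
s_const = [1.0] * n                                        # constant sensor
s_noise = [random.gauss(0, 1) for _ in range(n)]           # unrelated sensor
target = [2 * a + random.gauss(0, 0.5) for a in s_direct]  # stand-in for RUL

columns = {
    "s_direct": s_direct,
    "s_proxy": s_proxy,
    "s_const": s_const,
    "s_noise": s_noise,
}
selected = select_features(columns, target)
print(selected)
```

In this toy setting the proxy sensor is correlated with the target only through the direct sensor, so conditioning on the direct sensor removes it, while the constant and unrelated sensors fail the marginal test. In a pipeline of the kind described here, only the retained columns would then be passed to the sequence model; a full causal discovery algorithm would also test larger conditioning sets and orient edges, which this sketch does not attempt.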