Photoelasticity has been widely used to analyze changes in the properties of micro-sized objects under varying forces. In machine-learning-based predictive maintenance, failures are predicted from sensor data and system signals. However, predictive maintenance that combines photoelasticity with machine learning has not yet been reported, because time-staged photoelastic images are difficult to obtain. Predicting a future photoelastic image from past photoelastic images may help identify near-future failures, so that timely preventive measures can be taken for mechanical or structural components. Such prediction is a spatiotemporal problem: it requires a considerable amount of sequential data to capture temporal relations, together with image-based handling of spatial information. In this study, we first generated datasets of synthetic time-staged photoelastic images. Next, we examined several combinations of deep learning models, including an autoencoder, a recurrent neural network, and a convolutional neural network. Finally, we propose a novel multi-stage convolutional autoencoder model for failure prediction. The effectiveness of the proposed model in terms of loss and accuracy was verified through numerical tests. The results indicate that the proposed model is an effective framework for image-based future prediction of micro- and nanosized materials. Furthermore, the proposed model can be extended to other applications that involve image-based spatiotemporal data.
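
As an illustration of the spatiotemporal framing described above, predicting a future image from past images can be posed as a supervised problem by sliding a fixed-length window over a time-staged image sequence. The following is a minimal sketch under stated assumptions, not the model or data pipeline used in this study: the helper name `make_windows` and the parameter `n_past` are hypothetical, and each frame may be any image representation (for example, a 2D array of intensities).

```python
from typing import List, Sequence, Tuple, TypeVar

# A frame can be any image representation; the type is left generic here.
Frame = TypeVar("Frame")


def make_windows(
    frames: Sequence[Frame], n_past: int
) -> List[Tuple[List[Frame], Frame]]:
    """Pair each run of n_past consecutive frames with the frame that follows it.

    Hypothetical helper for illustration only: it shows how a time-staged
    sequence can be turned into (past frames, next frame) training pairs.
    """
    if n_past < 1:
        raise ValueError("n_past must be at least 1")
    pairs = []
    # Each window starts at index i and is followed by its target frame.
    for i in range(len(frames) - n_past):
        past = list(frames[i : i + n_past])
        target = frames[i + n_past]
        pairs.append((past, target))
    return pairs


# Usage: five time stages, three past frames per training example.
sequence = ["t0", "t1", "t2", "t3", "t4"]
for past, target in make_windows(sequence, n_past=3):
    print(past, "->", target)
```

In this sketch the string labels stand in for images. A sequence shorter than `n_past + 1` frames yields no pairs, which is one reason this kind of prediction needs a considerable amount of time-staged data.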