This paper presents a deep learning model for multi-temporal urban development prediction from satellite imagery. The model projects the evolution of urban areas, covering both historical trends and future development, to support informed decision-making in urban planning. Its performance was improved iteratively through changes to hyperparameters, augmentation techniques, loss functions, and model structure, followed by fine-tuning and adjustments to the activation functions. Evaluation on diverse datasets demonstrates the model's robustness and applicability. Two changes proved especially valuable. First, introducing the time skip as an Embedding layer enabled the model to capture temporal dependencies more effectively. Second, focusing on the pixel-level differences between the target and input images gave the model a more informative learning signal and led to improved predictions. Together with hyperparameter optimization, suitable augmentations, and adjustments to the loss calculation, these changes substantially improved the model's overall performance.
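The two ideas highlighted above, a time skip fed through an embedding lookup and a learning target built from pixel-level differences, can be illustrated with a minimal NumPy sketch. It is not the model described in this paper: the table size, embedding dimension, image shape, the assumption that pixel values are floats in [0, 1], and the names `embed_time_skip`, `difference_target`, and `reconstruct` are all hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, assumed for illustration only.
MAX_TIME_SKIP = 10   # largest gap between input and target, in time steps
EMBED_DIM = 8        # length of each time-skip embedding vector
HEIGHT, WIDTH = 4, 4

# An embedding layer is a lookup table with one row per discrete time skip.
# In a real model these rows would be trained; here they are random.
time_skip_table = rng.normal(size=(MAX_TIME_SKIP + 1, EMBED_DIM))


def embed_time_skip(skip: int) -> np.ndarray:
    """Return the embedding vector for an integer time skip."""
    if not 0 <= skip <= MAX_TIME_SKIP:
        raise ValueError(f"time skip {skip} outside 0..{MAX_TIME_SKIP}")
    return time_skip_table[skip]


def difference_target(input_image: np.ndarray, target_image: np.ndarray) -> np.ndarray:
    """Pixel-level change between target and input (float images assumed)."""
    return target_image - input_image


def reconstruct(input_image: np.ndarray, predicted_difference: np.ndarray) -> np.ndarray:
    """Turn a predicted difference back into an image, clipped to [0, 1]."""
    return np.clip(input_image + predicted_difference, 0.0, 1.0)


# Stand-in images with float pixel values in [0, 1).
input_image = rng.uniform(size=(HEIGHT, WIDTH))
target_image = rng.uniform(size=(HEIGHT, WIDTH))

skip_vector = embed_time_skip(3)
delta = difference_target(input_image, target_image)
recovered = reconstruct(input_image, delta)
```

With a perfect difference prediction, `reconstruct` returns the target image, so the difference formulation loses no information. It only moves the learning signal onto the pixels that change. Integer image formats such as `uint8` would need a cast to float before the subtraction to avoid wrap-around.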