Outdoor weather conditions such as haze, fog, sand dust, and low light significantly degrade image quality, causing color distortion, low contrast, and poor visibility. Despite the importance of restoring such degraded images, haze removal, sand dust image enhancement, and related restoration tasks remain challenging and relatively underexplored. Encoder-Decoder-based neural networks have yielded noticeable improvements in image restoration, yet their capacity to further improve image quality remains constrained. Recent advances in Vision Transformers and self-attention mechanisms have achieved remarkable success across computer vision tasks; however, directly applying Vision Transformers to image restoration raises serious challenges, notably reconciling local and global feature representations. This research addresses these limitations by restoring both sand dust- and haze-degraded images to a more natural and visually realistic appearance, with enhanced visibility, balanced colors, and refined details. We propose a novel hybrid architecture that combines depth-wise local feature extraction using lightweight Encoders with global feature extraction via Vision Transformers. These features are fused through an attention fusion mechanism, ensuring seamless interaction between local and global feature representations, and a single lightweight Decoder then reconstructs a high-quality restored image that closely matches the ground truth. The proposed method effectively reduces the feature inconsistency between Vision Transformer-based global features and lightweight encoder-based local features, achieving state-of-the-art performance on both synthetic and real-world sand dust- and haze-degraded images. Extensive evaluations, covering degraded images with color casts ranging from mild to severe, show that the proposed method outperforms previous conventional and deep learning-based restoration methods both qualitatively and quantitatively, delivering improved visibility, realistic textures, and superior image quality. In addition, we compare training and testing times and introduce a novel Energy Efficiency Index (EEI); the proposed method also surpasses prior methods on these efficiency measures.
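To make the described hybrid design concrete, the following is a minimal PyTorch sketch, not the paper's exact configuration: the channel width, patch size, attention-head count, and the specific channel-gated fusion are illustrative assumptions. It shows the three stated components: a depth-wise local encoder, a ViT-style global branch, and an attention fusion feeding a single lightweight decoder.

```python
import torch
import torch.nn as nn

class LocalEncoder(nn.Module):
    """Lightweight local branch: depth-wise separable convolutions."""
    def __init__(self, ch=32):
        super().__init__()
        self.stem = nn.Conv2d(3, ch, 3, padding=1)
        self.dw = nn.Conv2d(ch, ch, 3, padding=1, groups=ch)  # depth-wise
        self.pw = nn.Conv2d(ch, ch, 1)                        # point-wise
        self.act = nn.GELU()

    def forward(self, x):
        x = self.act(self.stem(x))
        return self.act(self.pw(self.dw(x)))

class GlobalEncoder(nn.Module):
    """Global branch: ViT-style self-attention over patch tokens."""
    def __init__(self, ch=32, patch=8, heads=4):
        super().__init__()
        self.patch = nn.Conv2d(3, ch, patch, stride=patch)    # patch embedding
        self.attn = nn.MultiheadAttention(ch, heads, batch_first=True)
        self.norm = nn.LayerNorm(ch)
        self.up = nn.Upsample(scale_factor=patch, mode="bilinear",
                              align_corners=False)

    def forward(self, x):
        t = self.patch(x)                                     # B, C, H/p, W/p
        b, c, h, w = t.shape
        seq = t.flatten(2).transpose(1, 2)                    # B, N, C tokens
        seq = self.norm(seq + self.attn(seq, seq, seq)[0])    # self-attention
        t = seq.transpose(1, 2).reshape(b, c, h, w)
        return self.up(t)                                     # back to full res

class AttentionFusion(nn.Module):
    """Channel-attention fusion of local and global features (illustrative)."""
    def __init__(self, ch=32):
        super().__init__()
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Conv2d(2 * ch, 2 * ch, 1),
                                  nn.Sigmoid())
        self.mix = nn.Conv2d(2 * ch, ch, 1)

    def forward(self, f_local, f_global):
        f = torch.cat([f_local, f_global], dim=1)
        return self.mix(f * self.gate(f))                     # gated fusion

class HybridRestorer(nn.Module):
    """Local + global encoders, attention fusion, one lightweight decoder."""
    def __init__(self, ch=32):
        super().__init__()
        self.local = LocalEncoder(ch)
        self.glob = GlobalEncoder(ch)
        self.fuse = AttentionFusion(ch)
        self.decoder = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1),
                                     nn.GELU(),
                                     nn.Conv2d(ch, 3, 3, padding=1))

    def forward(self, x):
        out = self.decoder(self.fuse(self.local(x), self.glob(x)))
        return torch.clamp(x + out, 0.0, 1.0)                 # residual restoration

degraded = torch.rand(1, 3, 256, 256)                         # dummy degraded input
restored = HybridRestorer()(degraded)
print(restored.shape)                                         # torch.Size([1, 3, 256, 256])
```

The depth-wise separable convolutions keep the local branch lightweight, while self-attention over patch tokens captures image-wide context such as a global color cast; the fusion gate then reweights the concatenated features before the shared decoder, which is one plausible way to reduce the local-global feature inconsistency the abstract highlights.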