Automatic face recognition in the wild still suffers from low-quality, low-resolution, noisy, and occluded input images that can severely impact identification accuracy. In this paper, we present a novel technique to enhance the quality of such extreme low-resolution face images beyond the current state of the art. We model the correlation between high- and low-resolution faces in a multi-resolution pyramid and show that we can recover the original structure of an unseen extreme low-resolution face image. By exploiting domain knowledge of the structure of the input signal and using sparse recovery optimization algorithms, we can recover a consistent sparse representation of the extreme low-resolution signal. The proposed super-resolution method is robust to noise and face misalignment, and can handle extreme low-resolution faces with just 7 pixels between the eyes at magnification factors of up to 16x. Moreover, the formulation of the proposed algorithm allows for simultaneous occlusion removal, a desirable property that, to the best of our knowledge, other super-resolution algorithms do not possess. Most importantly, we show that our method generalizes to real-world low-quality surveillance images, which indicates its potential impact in real-world scenarios.

Keywords: Sparse signal recovery (SSR); Single-image super-resolution (SSR); Extreme low resolution

© 2019 Elsevier Ltd. All rights reserved.
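
The sparse-recovery idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not name the solver, so orthogonal matching pursuit (OMP) is swapped in as a generic sparse recovery method, the multi-resolution pyramid is reduced to a single pair of coupled dictionaries, and the dictionaries are random rather than learned from faces. The names `omp`, `super_resolve`, `D_high`, `D_low`, and the 4x average-pooling operator `S` are all hypothetical and chosen for illustration only.

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal matching pursuit: greedy k-sparse solution of y ≈ D @ a."""
    y = np.asarray(y, dtype=float)
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the not-yet-selected atom most correlated with the residual.
        corr = np.abs(D.T @ residual)
        if support:
            corr[support] = 0.0
        support.append(int(np.argmax(corr)))
        # Refit all selected atoms by least squares, then update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    a = np.zeros(D.shape[1])
    a[support] = coef
    return a

def super_resolve(y_low, D_low, D_high, k):
    """Sparse-code y_low over D_low, then reuse the code with D_high."""
    # Atoms are normalized for selection; dividing the code by the norms
    # afterwards makes it valid for the unnormalized, coupled atoms.
    norms = np.linalg.norm(D_low, axis=0)
    a = omp(D_low / norms, y_low, k) / norms
    return D_high @ a, a

# Synthetic demo (assumption: random atoms stand in for learned face atoms).
rng = np.random.default_rng(0)
D_high = rng.standard_normal((256, 40))        # 40 atoms of flattened 16x16 images
P = np.kron(np.eye(4), np.ones((1, 4))) / 4.0  # 1-D 4x average pooling, shape (4, 16)
S = np.kron(P, P)                              # 2-D pooling on row-major vectors, (16, 256)
D_low = S @ D_high                             # coupled low-resolution atoms (4x4 images)

x_true = 2.5 * D_high[:, 7]                    # a "face" built from a single atom
y_low = S @ x_true                             # its observed low-resolution version
x_hat, a = super_resolve(y_low, D_low, D_high, k=1)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```

The key design choice is that the sparse code is estimated only from the low-resolution observation and then applied unchanged to the high-resolution dictionary, which is what lets the high-resolution structure be recovered. The demo uses 4x downsampling per dimension for brevity; the 16x setting mentioned in the abstract would need a correspondingly different operator and dictionaries.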