Accurate detection of objects in aerial imagery is a crucial image processing step for many applications, such as traffic monitoring, surveillance, reconnaissance and rescue tasks. Conventional methods for vehicle detection in aerial imagery have recently been outperformed by deep-learning-based detection frameworks like Faster R-CNN. Detecting small vehicles on the order of 10x20 pixels requires using the shallow layers of standard models like VGG-16, since only these layers provide a sufficiently high spatial resolution. However, this adaptation to the characteristics of aerial imagery results in long inference times, because proposals or detections are predicted at every location of the high-resolution feature map. In this paper, we propose an adaptive model which reduces the input region for the detection module and consequently reduces inference time. To this end, we extend Faster R-CNN with an additional Search Area Reduction module, which divides the input image into regions and predicts, for each region, a confidence score indicating how likely it is to contain at least one object. Many image regions, particularly in rural areas, do not contain any vehicles and are filtered out by our approach, which significantly reduces the inference time of the later stages. Our proposed framework achieves state-of-the-art detection results on a publicly available dataset while reducing the inference time of the object detection stage by more than 75%.
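
The region-filtering idea can be illustrated with a minimal sketch. It is not the proposed Search Area Reduction module itself: the function names (`split_into_regions`, `reduce_search_area`, `toy_score`), the fixed square grid, the region size and the threshold are all assumptions made for illustration, and the learned confidence predictor is replaced by a placeholder that scores a region by its mean intensity.

```python
import numpy as np


def split_into_regions(height, width, region_size):
    """Tile an image of the given size with (y0, x0, y1, x1) boxes (hypothetical helper)."""
    boxes = []
    for y0 in range(0, height, region_size):
        for x0 in range(0, width, region_size):
            boxes.append((y0, x0,
                          min(y0 + region_size, height),
                          min(x0 + region_size, width)))
    return boxes


def reduce_search_area(image, score_fn, region_size=256, threshold=0.5):
    """Keep only the regions whose confidence score reaches the threshold.

    score_fn stands in for a learned predictor of how likely a region is to
    contain at least one object; here it is any callable returning a float.
    """
    height, width = image.shape[:2]
    kept = []
    for (y0, x0, y1, x1) in split_into_regions(height, width, region_size):
        if score_fn(image[y0:y1, x0:x1]) >= threshold:
            kept.append((y0, x0, y1, x1))
    return kept


def toy_score(patch):
    """Placeholder confidence: mean intensity scaled to [0, 1] (assumption, not a trained model)."""
    return float(patch.mean()) / 255.0


# Synthetic 512x512 image in which only the top-right quadrant is "occupied".
image = np.zeros((512, 512), dtype=np.uint8)
image[0:256, 256:512] = 255

kept = reduce_search_area(image, toy_score, region_size=256, threshold=0.5)
kept_fraction = sum((y1 - y0) * (x1 - x0) for y0, x0, y1, x1 in kept) / (512 * 512)
```

In this toy setup only the kept regions would be passed on to the detection stage, so the detector would process a fraction `kept_fraction` of the original image area. The actual saving in inference time depends on the detector and on how the regions are batched, which this sketch does not model.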