Satellite images have broad coverage and are therefore ideal for large-scale urban reconstruction tasks. However, their low ground sampling resolution poses great challenges for traditional volumetric or stereo methods for 3D reconstruction. In this paper, we propose a novel deep-learning-based approach to single-view parametric reconstruction from satellite imagery. By parametrizing buildings as 3D cuboids, our method extends object detection systems to simultaneously localize buildings and directly fit a parametric model to each identified building. We use geo-registered GIS vector maps and Lidar data as supervision to train the network. In particular, we deconvolve the feature maps and combine convolutional feature maps from different stages of the network to handle the small, heavily cluttered building instances found in satellite imagery. We further enforce the physical constraint that buildings cannot overlap by predicting building boundaries with a separate fully convolutional network. We demonstrate the effectiveness of the proposed method on real-world data.
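
As a rough illustration of what parametrizing a building as a 3D cuboid can look like, the sketch below describes a building by a footprint center, footprint extents, an in-plane rotation, and a height, and expands those parameters into corner coordinates. The class name `Cuboid`, its field names, and the choice of six parameters are assumptions made for illustration only; they are not the paper's notation and not necessarily the exact parametrization the network regresses.

```python
import math
from dataclasses import dataclass


@dataclass
class Cuboid:
    """Hypothetical building cuboid: an oriented rectangle extruded to a height."""
    cx: float      # footprint center, x (ground-plane units, assumed)
    cy: float      # footprint center, y
    length: float  # extent along the cuboid's local x axis
    width: float   # extent along the cuboid's local y axis
    yaw: float     # rotation about the vertical axis, in radians
    height: float  # building height above the ground plane

    def footprint(self):
        """Return the four ground-plane corners of the rotated rectangle."""
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        hl, hw = self.length / 2.0, self.width / 2.0
        local = [(hl, hw), (-hl, hw), (-hl, -hw), (hl, -hw)]
        # Rotate each local corner by yaw, then translate to the center.
        return [(self.cx + c * x - s * y, self.cy + s * x + c * y)
                for x, y in local]

    def corners(self):
        """Return all eight 3D corners: the base ring, then the roof ring."""
        base = self.footprint()
        return ([(x, y, 0.0) for x, y in base]
                + [(x, y, self.height) for x, y in base])

    def volume(self):
        return self.length * self.width * self.height


if __name__ == "__main__":
    building = Cuboid(cx=0.0, cy=0.0, length=4.0, width=2.0, yaw=0.0, height=3.0)
    for corner in building.corners():
        print(corner)
```

Under this assumed parametrization, a detector that already outputs a 2D box per building would only need to regress a few extra numbers (here, orientation and height) to produce a full 3D model per instance.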