Encouraged by the success of CNNs in classification problems, researchers are actively applying CNNs to image-wide prediction problems such as segmentation, optical flow, reconstruction, and restoration. These approaches fall under the category of fully convolutional networks (FCNs) and have been very successful in bringing context into learning for image analysis. In this work, we address the problem of segmentation from medical images. Segmentation, or object delineation, from medical images/volumes is a fundamental step for the subsequent quantification tasks that are key to diagnosis. Semantic segmentation has been popularly addressed using FCNs (e.g., U-Net) with impressive results, and FCNs have been the front runners in recent segmentation challenges. However, FCN approaches have a few drawbacks that recent works have tried to address. First, local geometry such as smoothness and shape is not reliably captured. Second, while the spatial context captured by FCNs provides a richer representation, it carries the intrinsic drawback of overfitting and is quite sensitive to changes in appearance and shape. To address these issues, we propose a hybrid approach: generative modeling of image formation to jointly learn the triad of foreground (F), background (B), and shape (S). Such generative modeling of F, B, and S retains the FCN advantage of capturing context. Further, we expect the approach to be useful under limited training data, to produce results that are easy to interpret, and to enable easy transfer of learning across segmentation problems. We present an improvement of approximately 8% over state-of-the-art FCN approaches on ultrasound (US) kidney segmentation, while achieving comparable results on CT lung nodule segmentation.
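As an illustrative sketch (the per-pixel compositing form below is our assumption for exposition, not necessarily the exact formulation developed later in this work), such a generative model treats the shape $S$ as a soft mask that composites foreground and background appearance into the observed image:
\[
I(x) \;=\; S(x)\, F(x) \;+\; \bigl(1 - S(x)\bigr)\, B(x), \qquad S(x) \in [0, 1],
\]
so that segmentation amounts to recovering the mask $S$ that, together with plausible appearance fields $F$ and $B$, best explains the observed image $I$.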