Electronic Health Record (EHR) data, stored in enterprise databases or in cloud storage, is growing at an exponential rate. These records have grown large enough to be regarded as Big Data, and most of the data is unstructured. Processing the data in the cloud can lower processing costs. Predictive analytics helps physicians identify, at an early stage, patients who are likely to be admitted to hospital. Performing predictive analytics requires considering various factors, including demographic data, hospital parameters, patient history, and indicators for a specific disease. However, identifying the strong indicators needed for accurate prediction is a challenging task. Given the factors under consideration, various models and algorithms need to be studied. Approaches such as Naive Bayes, linear regression, generalized additive models, Random Forest, logistic regression, and Hidden Markov Models have to be considered when developing a predictive model. In this paper we propose a predictive model based on a scalable Random Forest classification algorithm that can accurately classify patients by their risk of diabetes.