The ever-expanding volume of multimedia content (such as images and videos), especially on the web, necessitates effective text-query-based search (or retrieval) systems. Popular approaches to this problem use the query-likelihood model, which fails to capture the user's information needs. In this work, therefore, we explore a new ranking approach in the context of image and video retrieval from text queries. Our approach assumes two separate underlying distributions for the query and the document, respectively, and ranks documents by the degree of similarity between these two statistical distributions. Furthermore, we extend our approach using Active Learning techniques to address the question of achieving good performance without requiring a fully labeled training dataset; this is done by taking Sample Uncertainty, Density, and Diversity into account. Our experiments on the popular TRECVID corpus and the open, relatively small USC SmartBody corpus show that we are almost at par with, and sometimes better than, multiple state-of-the-art baselines.
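The abstract does not specify which measure is used to compare the query and document distributions, so the following is only an illustrative sketch rather than the paper's method: it ranks documents by negative KL divergence between a query distribution and each document distribution, assuming both are discrete distributions over a shared feature vocabulary. The names kl_divergence and rank_documents and the toy data are hypothetical.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as 1-D arrays."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))

def rank_documents(query_dist, doc_dists):
    """Return document indices sorted by decreasing similarity (negative KL) to the query."""
    scores = [-kl_divergence(query_dist, d) for d in doc_dists]
    return sorted(range(len(doc_dists)), key=lambda i: scores[i], reverse=True)

# Toy usage: one query distribution and three document distributions
# over a shared 4-dimensional feature vocabulary (hypothetical data).
query = [0.4, 0.3, 0.2, 0.1]
docs = [
    [0.35, 0.35, 0.20, 0.10],  # closest to the query
    [0.10, 0.10, 0.40, 0.40],  # far from the query
    [0.40, 0.30, 0.15, 0.15],  # also close to the query
]
print(rank_documents(query, docs))  # -> [0, 2, 1]
```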
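Likewise, the abstract names Sample Uncertainty, Density, and Diversity as active-learning selection criteria but does not give the scoring rule. One common way to combine the three is a greedy weighted sum, sketched below; the function select_batch, the weights alpha/beta/gamma, and the particular density and diversity definitions are assumptions for illustration, not the paper's formulation.

```python
import numpy as np

def select_batch(candidates, labeled, uncertainty, k=5, alpha=1.0, beta=1.0, gamma=1.0):
    """Greedily pick k unlabeled samples scoring high on uncertainty, density, and diversity.

    candidates:  (n, d) feature matrix of unlabeled samples
    labeled:     (m, d) feature matrix of already-labeled samples
    uncertainty: length-n array, e.g. 1 - max predicted class probability
    """
    candidates = np.asarray(candidates, dtype=float)
    selected = [np.asarray(s, dtype=float) for s in labeled]
    chosen = []
    # Density: average similarity of a candidate to all other candidates.
    sims = candidates @ candidates.T
    density = sims.mean(axis=1)
    for _ in range(min(k, len(candidates))):
        best, best_score = None, -np.inf
        for i in range(len(candidates)):
            if i in chosen:
                continue
            # Diversity: distance to the nearest already-selected sample.
            if selected:
                diversity = min(np.linalg.norm(candidates[i] - s) for s in selected)
            else:
                diversity = 1.0
            score = alpha * uncertainty[i] + beta * density[i] + gamma * diversity
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
        selected.append(candidates[best])
    return chosen
```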