This paper surveys and empirically evaluates state-of-the-art activity recognition methods that use still RGB images and/or videos. Understanding human activities from videos or still images is a challenging task in the computer vision domain. Automatically detecting the action or activity being performed and then recognizing it is the primary goal of an intelligent video system. Human activity recognition arises in various application domains, ranging from human-computer interfaces and health care monitoring to surveillance and security. Despite ongoing efforts in the domain, these tasks remain unsolved in unconstrained environments and face many challenges, such as occlusions, variations in clothing and background clutter. Recently, numerous deep learning algorithms have been proposed to solve traditional artificial intelligence problems. They have shown great advances, in particular for the pose estimation task, since they can extract appropriate features while jointly performing discrimination. In this paper, we provide a detailed review of recent and state-of-the-art research advances in the field of human activity recognition. We propose a categorization of human activity recognition methodologies and discuss their advantages and limitations. In particular, we divide feature representation methods into global, local and body modeling approaches. Human activity classification approaches are then arranged into three categories that reflect how they model human activities: template-based, generative and discriminative. Moreover, we provide a comprehensive analysis of pose-based human activity recognition, covering both conventional and deep learning-based human pose estimation approaches. Finally, we discuss the open challenges in this field and suggest possible solutions.