Deep Learning algorithms have been shown to deliver state-of-the-art results in many applications, but at the cost of high computational complexity; accelerating such algorithms in hardware is therefore highly desirable. However, since computational requirements grow exponentially with the accuracy achieved, the demand for hardware resources is significant. To tackle this issue, we propose a methodology, spanning both software and hardware, to optimize Deep Neural Networks (DNNs). We discuss and analyze pruning, approximation through quantization, and specialized accelerators for DNN inference. For each phase of the methodology, we provide quantitative comparisons with existing techniques and hardware platforms.
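The abstract names pruning and quantization only at this high level; as a minimal illustration of what these terms typically mean, the sketch below applies unstructured magnitude pruning and symmetric per-tensor 8-bit post-training quantization to a weight tensor with NumPy. This is not the methodology evaluated in the paper; the function names, the 50% sparsity target, and the per-tensor scaling scheme are illustrative assumptions.

    import numpy as np

    def prune_by_magnitude(weights: np.ndarray, sparsity: float = 0.5) -> np.ndarray:
        # Unstructured pruning: zero out the smallest-magnitude weights so that
        # roughly `sparsity` fraction of the entries become zero.
        threshold = np.quantile(np.abs(weights), sparsity)
        return np.where(np.abs(weights) >= threshold, weights, 0.0)

    def quantize_int8(weights: np.ndarray):
        # Symmetric per-tensor quantization: one scale maps the largest-magnitude
        # weight onto the int8 range [-127, 127]. The epsilon guards all-zero tensors.
        scale = max(np.max(np.abs(weights)), 1e-8) / 127.0
        q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
        return q, scale

    def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
        # Recover a float approximation of the original weights.
        return q.astype(np.float32) * scale

    # Example: measure the approximation error introduced by both steps.
    w = np.random.randn(64, 64).astype(np.float32)
    w_pruned = prune_by_magnitude(w, sparsity=0.5)
    q, s = quantize_int8(w_pruned)
    print("max quantization error:", np.abs(w_pruned - dequantize(q, s)).max())

Both steps trade a small, measurable accuracy loss for reduced storage and arithmetic cost, which is what makes them attractive starting points for the hardware accelerators discussed later.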