Flight delays are a fundamental problem in the global aviation system. Delay-inducing disruptions have various causes, including increased global connectivity and interdependency, weather phenomena, and limited infrastructure resources. Given the considerable time and money lost to delays in aircraft operations, flight delay prediction has become an active research topic in recent years. In particular, the existence of code repositories for standardized machine learning applications, together with the data available on this subject, has led to a growing number of papers, which usually compare the performance of new and existing delay prediction techniques. In this study, we review the contributions of recent papers to delay prediction. We apply a six-step comparison framework that starts with data collection and processing, covers aspects such as model and feature selection, and ends with evaluation, integration of various technologies, and reproducibility considerations. We find that, although current studies have proposed effective approaches, several challenges remain in the field of delay prediction. We elaborate on how to overcome these challenges by discussing a set of research directions, which we hope will help other researchers conduct better future studies and help reduce the impact of delays on air transportation systems.