In this paper, we comprehensively review current progress in visual-inertial (VI) navigation systems and sensor fusion research, with a particular focus on small unmanned aerial vehicles, known as micro aerial vehicles (MAVs). Such fusion has attracted considerable attention because the two sensing modalities, vision and inertial sensing, have complementary characteristics. We discuss the pros and cons of the most widely implemented VI systems with respect to the navigational and maneuvering capabilities of MAVs. Considering the problem of optimal data fusion from multiple heterogeneous sensors, we examine the potential of the most widely used advanced state estimation techniques (linear and nonlinear, Bayesian and non-Bayesian) against various MAV design considerations. Finally, we highlight several research opportunities and potential challenges associated with each technique.

Note to Practitioners: Robotic aircraft have been widely deployed to improve safety, efficiency, and productivity in applications such as agriculture, law enforcement, and building inspection. Because VI navigation forms part of the autonomous navigation system of such aircraft, this review addresses several aspects of VI navigation systems from both data fusion and technological perspectives.