Visual tracking estimates the trajectory of an object of interest in a non-stationary image stream whose statistics change over time. Model-free tracking has recently received increased interest, since manually annotating sufficient examples of every object in the world is prohibitively expensive. By definition, a model-free tracker is given only one labeled instance: the identified object in the first frame. In all subsequent frames, it must learn variations of the tracked object from unlabeled data alone. This creates a dilemma for model-free trackers: an appearance model that adapts too eagerly drifts onto background clutter (loss of stability), while one that adapts too conservatively loses the object and produces very short tracks (loss of adaptivity); which failure mode dominates depends largely on how sensitive the appearance model is. In contrast to recent data-driven surveys that focus on benchmark performance, this article provides an in-depth survey of solutions to the dilemma between adaptivity and stability in model-free tracking, focusing on the ability to achieve situation awareness, i.e., to learn the object's appearance adaptively in a non-stationary environment. The survey shows that, regardless of the visual representations and statistical models involved, two factors are key to many, if not all, tracking evaluation measures: how unlabeled data from the changing environment are exploited, and how rapidly the appearance model is updated with selected examples carrying estimated labels. This conceptual consensus, despite the diversity of approaches in the field, captures for the first time the essence of model-free tracking and facilitates the design of visual tracking systems.
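The adaptivity-stability trade-off described above can be made concrete with a minimal sketch of an online appearance-model update. This is an illustration, not a method from the surveyed literature: the function name `update_template` and the parameters `lr` (learning rate, governing adaptivity) and `tau` (a confidence gate on the tracker's own estimated label, governing stability) are hypothetical choices for exposition.

```python
def update_template(template, observation, score, lr=0.05, tau=0.6):
    """Blend an appearance template with a newly observed feature vector.

    template    -- current appearance model (list of floats)
    observation -- features extracted at the estimated object location
    score       -- the tracker's own match confidence (an estimated,
                   possibly wrong, label for this unlabeled frame)
    lr          -- adaptivity: higher values absorb appearance change
                   faster but risk drifting onto background clutter
    tau         -- stability: frames below this confidence are not
                   learned from at all
    """
    if score < tau:
        # Low confidence: keep the model stable, ignore this frame.
        return list(template)
    # Exponential moving average toward the new observation.
    return [(1.0 - lr) * t + lr * o for t, o in zip(template, observation)]


# Toy usage: a 3-dimensional feature template tracked over two frames.
template = [1.0, 0.0, 0.5]
template = update_template(template, [0.8, 0.2, 0.5], score=0.9)  # adapted
template = update_template(template, [0.0, 9.0, 9.0], score=0.2)  # rejected
```

Setting `lr` high and `tau` low yields an adaptive but drift-prone tracker; the opposite yields a stable tracker that may fail under appearance change, which is exactly the dilemma the survey examines.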