Purpose: To assess the impact of false positives (FP), false negatives (FN), fixation losses (FL), and test duration (TD) on visual field (VF) reliability at different stages of glaucoma severity.

Design: Retrospective.

Participants: A total of 10 262 VFs from 1538 eyes of 909 subjects with suspect or manifest glaucoma and >= 5 VF examinations.

Methods: Predicted mean deviation (MD) was calculated with multilevel modeling of longitudinal data. Differences between predicted and observed MD (Delta MD) were calculated as a reliability measure. The impact of FP, FN, FL, and TD on Delta MD was assessed using multilevel modeling.

Main Outcome Measures: Delta MD associated with a 10% increment in FP, FN, and FL, or a 1-minute increase in TD.

Results: FL had little impact on Delta MD (<0.2 decibels [dB] per 10% abnormal catch trials), and no level of FL produced a Delta MD >= 1 dB at any disease stage. FP yielded greater-than-expected MD: up to 20% abnormal catch trials, each 10% increment was associated with a Delta MD = 0.42, 0.73, and 0.66 dB in mild (MD > -6 dB), moderate (-12 < MD <= -6 dB), and severe (-20 <= MD <= -12 dB) disease, respectively, and beyond 20% abnormal catch trials with a Delta MD = 1.57, 2.06, and 3.53 dB. FN generally produced observed MDs below expected MDs. FN had minimal impact up to 20% abnormal catch trials (Delta MD per 10% increment > -0.14 dB at all levels of severity). Beyond 20% abnormal catch trials, each 10% increment was associated with a Delta MD = -1.27, -0.53, and -0.51 dB in mild, moderate, and severe disease, respectively. |Delta MD| >= 1 dB occurred with 22% FP and 26% FN in mild, 14% FP and 34% FN in moderate, and 16% FP and 51% FN in severe disease. A 1-minute increment in TD produced Delta MDs between -0.35 and -0.40 dB.

Conclusions: FL have little impact on reliability in patients with established glaucoma. FP, and to a lesser extent FN and TD, significantly affect reliability.
The impact of FP and FN varies with disease severity and over the range of abnormal catch trials. On the basis of our findings, we present evidence-based, severity-specific standards for classifying VF reliability for clinical or research applications. (C) 2017 by the American Academy of Ophthalmology
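
As a rough illustration of how the severity bands and the FP figures reported above fit together, the following Python sketch classifies an eye by MD and applies the per-10% FP slopes as a two-segment line with a break at 20% abnormal catch trials. This is only one reading of the numbers in the abstract, not the authors' multilevel model: the zero intercept, the piecewise-linear form, and the names `severity` and `approx_delta_md_from_fp` are assumptions made for illustration. Positive values correspond to an observed MD above the expected MD, as described for FP.

```python
# Illustrative sketch only. The slopes are the per-10% FP values quoted in the
# abstract; the zero intercept and two-segment form are assumptions, and the
# function names are hypothetical.

# (slope up to 20% FP, slope beyond 20% FP), in dB per 10% abnormal catch trials
FP_SLOPES_DB_PER_10PCT = {
    "mild": (0.42, 1.57),
    "moderate": (0.73, 2.06),
    "severe": (0.66, 3.53),
}
BREAK_PCT = 20.0  # break point between the two reported FP ranges


def severity(md_db: float) -> str:
    """Map mean deviation (dB) to the severity bands used in the abstract."""
    if md_db > -6:
        return "mild"
    if md_db > -12:
        return "moderate"
    if md_db >= -20:
        return "severe"
    raise ValueError("MD below -20 dB is outside the bands reported")


def approx_delta_md_from_fp(fp_pct: float, stage: str) -> float:
    """Approximate Delta MD (dB) attributable to a given FP percentage."""
    if not 0.0 <= fp_pct <= 100.0:
        raise ValueError("fp_pct must be between 0 and 100")
    low_slope, high_slope = FP_SLOPES_DB_PER_10PCT[stage]
    below = min(fp_pct, BREAK_PCT)        # portion within the first range
    above = max(fp_pct - BREAK_PCT, 0.0)  # portion beyond the break point
    return (below * low_slope + above * high_slope) / 10.0


if __name__ == "__main__":
    stage = severity(-8.5)  # falls in the moderate band
    print(stage, round(approx_delta_md_from_fp(14, stage), 2))
```

Under these assumptions the sketch crosses 1 dB at 14% FP in moderate disease, 16% in severe disease, and 22% in mild disease when FP is stepped in whole percentages, which agrees with the thresholds reported above. The FN slopes below 20% are given only as a bound, so they are not modeled here.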