Objective: To evaluate the impact of case mix variation on the performance of the Acute Physiology and Chronic Health Evaluation (APACHE) II using measures of calibration and discrimination. Design: APACHE II data were collected prospectively at the surgical intensive care unit of the University of Vermont on all adult admissions over an 8-yr period (excluding cardiac surgical patients, burn patients, and patients <16 yrs of age). The original case mix was systematically varied to create 2,000 different case mixes ranging in mortality between 5% and 18% using a computer-intensive resampling algorithm. The area under the receiver operating characteristic curve and the Hosmer-Lemeshow C statistic were derived for each of the simulated case mixes with bootstrapping. Setting: The surgical intensive care unit at a 450-bed teaching hospital. Patients: A group of 6,806 adult surgical patients, excluding cardiac surgical patients and burn patients. Measurements and Results: Simulated data sets were created from a database of patients treated at a single institution to test the hypothesis that the performance of APACHE II is stable across a clinically reasonable range of mortality rates. The discrimination and calibration of APACHE II varied with case mix. Conclusion: The discrimination of APACHE II is not independent of case mix. However, the variability of the Hosmer-Lemeshow statistic as a function of the case mix may simply reflect the limitations of this goodness-of-fit statistic in assessing model calibration. Because the discrimination of APACHE II is a function of case mix, caution should be exercised when using APACHE II-based adjusted mortality rates to compare intensive care units with widely divergent case mixes.
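The two performance measures named in the abstract can be illustrated with a minimal Python sketch. It is not the study's code: the abstract does not describe the resampling algorithm, so `resample_case_mix` is a hypothetical, deliberately simple scheme that draws deaths and survivors with replacement to reach a target mortality. The function names, the ten risk groups, and the synthetic patients in the demo are all assumptions made for illustration. Because this scheme samples within outcome groups, it would be expected to leave discrimination roughly unchanged, so it should not be read as a way to reproduce the reported findings.

```python
import random


def auc(risks, died):
    """Area under the ROC curve: the share of (death, survivor) pairs in
    which the death had the higher predicted risk; ties count as half."""
    pos = [r for r, d in zip(risks, died) if d]
    neg = [r for r, d in zip(risks, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def hosmer_lemeshow_c(risks, died, groups=10):
    """Hosmer-Lemeshow C statistic over equal-sized groups of patients
    ordered by predicted risk (ten groups is an assumed default)."""
    pairs = sorted(zip(risks, died))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        size = len(chunk)
        observed = sum(d for _, d in chunk)   # observed deaths in the group
        expected = sum(r for r, _ in chunk)   # sum of predicted risks
        mean_risk = expected / size
        denom = size * mean_risk * (1.0 - mean_risk)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


def resample_case_mix(risks, died, target_mortality, size, rng):
    """Hypothetical resampler (NOT the study's algorithm): draw deaths and
    survivors with replacement so the sample has the target mortality."""
    dead = [r for r, d in zip(risks, died) if d]
    alive = [r for r, d in zip(risks, died) if not d]
    if not dead or not alive:
        raise ValueError("need at least one death and one survivor")
    n_dead = round(size * target_mortality)
    sample = [(rng.choice(dead), 1) for _ in range(n_dead)]
    sample += [(rng.choice(alive), 0) for _ in range(size - n_dead)]
    return [r for r, _ in sample], [d for _, d in sample]


if __name__ == "__main__":
    # Synthetic patients for illustration only; not data from the study.
    rng = random.Random(42)
    risks = [rng.betavariate(1, 8) for _ in range(2000)]
    died = [1 if rng.random() < r else 0 for r in risks]
    for target in (0.05, 0.10, 0.18):
        r, d = resample_case_mix(risks, died, target, 1000, rng)
        print(target, round(auc(r, d), 3), round(hosmer_lemeshow_c(r, d), 1))
```

In this sketch the predicted risks are held fixed while mortality is shifted, so the Hosmer-Lemeshow value tends to grow as the simulated mortality moves away from the mean predicted risk. This gives a rough sense of why the statistic can vary with case mix even when the model itself is unchanged.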