Dynamic planning of water infrastructure requires identifying signals for adaptation, including measures of system performance linked to known vulnerabilities. It remains challenging, however, to detect projected changes in performance that fall outside the envelope of natural variability, and to attribute such detections to one or more uncertain drivers. This study investigated these questions using a combination of ensemble simulation, nonparametric tests, and variance decomposition, demonstrated for a case study of the Sacramento-San Joaquin River Basin, California. The scenario ensemble couples climate and land-use change through the end of the century, evaluated with a multireservoir simulation model to quantify changes in water supply reliability and flooding metrics relative to the historical period (1951-2000). We also trained a logistic regression classifier to predict future detections from observed trends in performance over time. The results show that the reliability metric is far more likely to exhibit a significant change within the century, with the most severe scenarios tending to be detected earlier, reflecting long-term trends. Changes in flooding, by contrast, are often masked by natural variability despite severe events in some scenarios. The variance in detection times is largely attributable to the choice of climate model, with smaller contributions from the emissions scenario and its interaction with the climate model. Finally, for both metrics, the prediction model learns to associate more recent observations of system performance with nonstationarity detection. These findings underscore the importance of differentiating between long-term change and natural variability when identifying signals for adaptation.
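The detection step described above can be illustrated with a minimal sketch. The abstract does not name the specific nonparametric test used, so the Mann-Kendall trend test is assumed here as a common choice for distinguishing a monotonic trend in a performance metric from natural variability; the example series are hypothetical, not data from the study.

```python
import math

def mann_kendall(series):
    """Mann-Kendall trend test (normal approximation, no tie correction).

    Returns the S statistic and an approximate two-sided p-value.
    A small p-value suggests a monotonic trend beyond what random
    variability alone would plausibly produce.
    """
    n = len(series)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p-value from the standard normal CDF.
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return s, p

# Hypothetical reliability series: one with a steady decline plus small
# fluctuations, one with fluctuations only (stationary baseline).
declining = [1.0 - 0.02 * t + 0.01 * ((-1) ** t) for t in range(30)]
flat = [0.01 * ((-1) ** t) for t in range(30)]

s_dec, p_dec = mann_kendall(declining)    # strong negative trend detected
s_flat, p_flat = mann_kendall(flat)       # no trend: large p-value
```

The normal approximation is generally considered adequate for series longer than about ten points; in a study setting, the test would be applied to rolling windows of a simulated performance metric, and the first year the p-value crosses a chosen threshold would mark the detection time.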