Recent work has suggested that reward prediction errors elicit a positive voltage deflection in the scalp-recorded electroencephalogram (EEG), an event sometimes termed a reward positivity. However, a strong test of this proposed relationship remains to be defined. Other important questions also remain unaddressed, such as the role of the reward positivity in predicting future behavioral adjustments that maximize reward. To answer these questions, a three-armed bandit task was used to investigate the role of positive prediction errors during trial-by-trial exploration and task-set-based exploitation. The feedback-locked reward positivity was characterized by delta band activities, and these related EEG features scaled with the degree of a computationally derived positive prediction error. However, these phenomena were also dissociated: the computational model predicted exploitative action selection and related response time speeding, whereas the feedback-locked EEG features did not. Compellingly, delta band dynamics time-locked to the subsequent bandit (the P3) successfully predicted these behaviors. These bandit-locked findings included an enhanced parietal-to-motor-cortex delta phase lag that correlated with the degree of response time speeding, suggesting a mechanistic role for delta band activities in motivating action selection. This dissociation between feedback-locked and bandit-locked EEG signals is interpreted as a differentiation between hierarchically distinct types of prediction error, yielding novel predictions about these dissociable delta band phenomena during reinforcement learning and decision making. © 2015 Elsevier Inc. All rights reserved.
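
As an illustration of what a "computationally derived positive prediction error" in a three-armed bandit task can look like, the following minimal sketch uses a standard delta-rule value update with softmax action selection. The abstract does not specify the authors' model, so the update rule, the softmax choice rule, the function names, and all parameter values (learning_rate, temperature, reward_probs) are assumptions made here for illustration only.

```python
import math
import random


def softmax(values, temperature):
    """Turn action values into choice probabilities (assumed choice rule)."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def delta_rule_update(q_values, arm, reward, learning_rate):
    """Return updated values and the prediction error for the chosen arm.

    prediction_error = reward - expected value; it is positive when the
    outcome is better than expected.
    """
    prediction_error = reward - q_values[arm]
    updated = list(q_values)
    updated[arm] += learning_rate * prediction_error
    return updated, prediction_error


def simulate(reward_probs, n_trials=200, learning_rate=0.2,
             temperature=0.2, seed=0):
    """Simulate a hypothetical agent on a bandit with fixed reward odds."""
    rng = random.Random(seed)
    q = [0.0] * len(reward_probs)
    history = []  # one (arm, reward, prediction_error) tuple per trial
    for _ in range(n_trials):
        probs = softmax(q, temperature)
        arm = rng.choices(range(len(q)), weights=probs)[0]
        reward = 1.0 if rng.random() < reward_probs[arm] else 0.0
        q, pe = delta_rule_update(q, arm, reward, learning_rate)
        history.append((arm, reward, pe))
    return q, history


if __name__ == "__main__":
    # Hypothetical reward probabilities for the three arms.
    final_q, trials = simulate([0.2, 0.5, 0.8])
    positive_pes = [pe for _, _, pe in trials if pe > 0]
    print("final values:", [round(v, 2) for v in final_q])
    print("trials with a positive prediction error:", len(positive_pes))
```

In a sketch like this, the trial-wise positive prediction errors are the quantity one would relate to feedback-locked EEG amplitude, while the softmax probabilities describe the shift from exploration toward exploitation as values are learned.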