We focus on modeling the relationship between an input feature vector and the predicted outcome of a trained decision tree using mixed-integer optimization. This can be used in many practical applications where a decision tree or a tree ensemble is incorporated into an optimization problem to model the predicted outcomes of a decision. We propose novel tight mixed-integer optimization formulations for this problem. Existing formulations can be shown either to have linear relaxations with fractional extreme points, even for the simple case of modeling a single decision tree, or to require a very large number of constraints; both lead to slow solve times in practice. A formulation we propose, based on a projected union-of-polyhedra approach, is ideal (i.e., the extreme points of the linear relaxation are integer when required) for a single decision tree. Although this formulation is generally not ideal for tree ensembles, it typically has fewer extreme points, leading to faster solve times. We also study formulations with a binary representation of the feature vector and present multiple approaches to tighten existing formulations. We show that fractional extreme points are removed when there are multiple splits on the same feature. At an extreme, we prove that this yields an ideal formulation for a tree ensemble modeling a one-dimensional feature vector. Building on this result, we also show that these additional constraints result in significantly tighter linear relaxations when the feature vector is low dimensional.
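
To make the object of study concrete, the following is a minimal sketch, in our own illustrative notation rather than the paper's, of the standard extended (Balas-style) formulation of a single trained tree as a union of leaf polyhedra; the projected formulation mentioned above eliminates the disaggregated copies $x^l$ of the feature vector.

% Sketch: each leaf $l \in \mathcal{L}$ has a polyhedron $\{x : A_l x \le b_l\}$
% collecting the splits on its root-to-leaf path, and a prediction $c_l$.
\begin{align*}
  & y = \sum_{l \in \mathcal{L}} c_l z_l, \qquad x = \sum_{l \in \mathcal{L}} x^l, \qquad \sum_{l \in \mathcal{L}} z_l = 1,\\
  & A_l x^l \le b_l z_l, \qquad z_l \in \{0,1\}, \qquad \forall\, l \in \mathcal{L}.
\end{align*}

Exactly one leaf indicator $z_l$ is active; assuming bounded leaf polyhedra (e.g., a box-constrained feature domain), the constraint $A_l x^l \le b_l z_l$ forces the inactive copies $x^l$ to zero, and $y$ takes the prediction of the active leaf.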
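
In the same spirit, a hedged sketch of the binary-representation setup that the tightening results concern, again in illustrative notation: one binary variable per split, with splits on the same feature nested by threshold.

% Sketch: $w_s = 1$ iff the split condition $x_{v(s)} \le t_s$ holds, where
% $v(s)$ is the feature and $t_s$ the threshold of split $s$.
\begin{align*}
  & w_s \le w_{s'} \qquad \text{for all splits } s, s' \text{ with } v(s) = v(s') \text{ and } t_s \le t_{s'},\\
  & z_l \le w_s \text{ for each left branch, and } z_l \le 1 - w_s \text{ for each right branch, on the path to leaf } l.
\end{align*}

Intuitively, with a one-dimensional feature vector every split is on the same feature, so constraints of this kind order all split variables into a single chain, which is consistent with the ideality result stated above.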