The need for parallelism in the time dimension is being driven by changes in computer architectures, where performance increases are now provided through greater concurrency, not faster clock speeds. This creates a bottleneck for sequential time marching schemes because they lack parallelism in the time dimension. Multigrid reduction in time (MGRIT) is an iterative procedure that allows for temporal parallelism by utilizing multigrid reduction techniques and a multilevel hierarchy of coarse time grids. MGRIT has been shown to be effective for linear problems, with speedups of up to 50 times. The goal of this work is the efficient solution of nonlinear problems with MGRIT, where efficiency is defined as achieving similar performance when compared to an equivalent linear problem. The benchmark nonlinear problem is the p-Laplacian, where p = 4 corresponds to a well-known nonlinear diffusion equation and p = 2 corresponds to the standard linear diffusion operator, our benchmark linear problem. The key difficulty encountered is that the nonlinear time step solver becomes progressively more expensive on coarser time levels as the time-step size increases. To overcome such difficulties, multigrid research has historically targeted an accumulated body of experience regarding how to choose an appropriate solver for a specific problem type. To that end, this paper develops a library of MGRIT optimizations and modifications, most important an alternate initial guess for the nonlinear time-step solver and delayed spatial coarsening, that will allow many nonlinear parabolic problems to be solved with parallel scaling behavior comparable to the corresponding linear problem.