This paper describes Mtool, a software tool for analyzing performance losses in shared-memory parallel programs. Mtool augments a program with low-overhead instrumentation that perturbs the program's execution as little as possible while generating enough information to isolate memory and synchronization bottlenecks. After running the instrumented version of the parallel program, the programmer can use Mtool's window-based user interface to view compute time, memory, and synchronization bottlenecks at increasing levels of detail, from the whole-program level down to individual procedures, loops, and synchronization objects. The paper describes Mtool's low-overhead instrumentation methods, memory bottleneck detection technique, and attention-focusing mechanisms; contrasts Mtool with other approaches; and offers a case study to demonstrate the tool's effectiveness.