The Fermat principle is applicable in arbitrary metrics, stationary or nonstationary, for massive and massless particles, as well as for massive and massless observers. I give a general derivation, along with several ways to regard the principle. The practical usefulness of the Fermat principle is restricted to cases in which a sufficiently small set of suitable trial paths can be found. Zigzag trial paths are suitable for thin comoving gravitational lenses, to first order in the lens effect, also when a nonstationary perturbation is present. As an example, I apply the Fermat principle to a perturbation by gravitational waves and derive the transverse velocity of the caustic motion. This velocity poses a difficulty for the proposition by McBreen and Metcalfe that γ-bursts come from small hot BL Lac cores crossed by microcaustics, the caustics being jittered by gravitational waves. A stochastic gravitational radiation background cannot account for the γ-bursts registered by different spacecraft separated by interplanetary distances, since the caustic motion it produces is too slow. If, on the other hand, there were local sources of gravitational radiation producing superluminal caustic motion, the light-travel-time direction annuli would cluster at the solar system poles.