This paper describes a method for reducing the information contained in an image sequence while retaining the information necessary for its interpretation by a human observer. The method consists of first locating the redundant information, then reducing the degree of redundancy, and finally coding the result. The sequence is treated as a single 3-D data volume, the voxels of which are grouped into regions obtained by a 3-D split-and-merge algorithm. To find these regions, we first obtain an initial region space by splitting the image sequence until the gray-level variation over each region can be approximated by a 3-D polynomial to a specified accuracy. This results in a set of parallelepipedic regions of various sizes. To represent the gray-level variation over these regions, the coefficients of the approximating polynomial are used as features. The most similar regions are then merged, using a region adjacency graph. The information is coded by representing the borders of the regions with a pyramidal structure in the (x, y, t) space. The coefficients of the approximating polynomials are coded in a straightforward manner. For 256 x 256 pixel, 25 frame/s image sequences, compression ratios allowing transmission rates near 64 kbit/s are obtained.
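The splitting step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a cubic, power-of-two volume and an octree-style split into eight sub-blocks (the paper allows parallelepipedic regions of various sizes), and it uses a first-order 3-D polynomial fitted by least squares, with the maximum absolute residual as the accuracy criterion. All function names are hypothetical.

```python
import numpy as np

def fit_poly(block):
    """Least-squares fit of a first-order 3-D polynomial
    g(x, y, t) ~ a0 + a1*x + a2*y + a3*t over the block,
    using local block coordinates.  Returns the coefficients
    and the maximum absolute residual (the fit error)."""
    d, h, w = block.shape
    t, y, x = np.meshgrid(np.arange(d), np.arange(h), np.arange(w),
                          indexing="ij")
    A = np.stack([np.ones(block.size), x.ravel(), y.ravel(), t.ravel()],
                 axis=1)
    coef, *_ = np.linalg.lstsq(A, block.ravel().astype(float), rcond=None)
    return coef, np.abs(A @ coef - block.ravel()).max()

def split(vol, t0, y0, x0, n, tol, regions):
    """Recursively split the (x, y, t) volume until the gray-level
    variation over each region fits the polynomial within tol.
    Each leaf stores (origin, size, polynomial coefficients);
    the coefficients then serve as features for the merge step."""
    block = vol[t0:t0 + n, y0:y0 + n, x0:x0 + n]
    coef, err = fit_poly(block)
    if err <= tol or n == 1:          # accurate enough: keep as one region
        regions.append(((t0, y0, x0), n, coef))
        return
    half = n // 2                      # otherwise split into 8 sub-blocks
    for dt in (0, half):
        for dy in (0, half):
            for dx in (0, half):
                split(vol, t0 + dt, y0 + dy, x0 + dx, half, tol, regions)
```

A volume whose gray levels are exactly linear in (x, y, t) is kept as a single region, while a discontinuity (e.g., a moving edge) forces recursive subdivision around it; the resulting leaves are the initial region space handed to the region-adjacency-graph merge.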