1. Dividing each video frame into blocks of pixels so that processing of the video frame can be
conducted at the block level.
2. Exploiting the spatial redundancies that exist within the video frame by coding some of the original
blocks through transform, quantization and entropy coding (or variable-length coding).
3. Exploiting the temporal dependencies that exist between blocks in successive frames, so that only
changes between successive frames need to be encoded. This is accomplished by using motion
estimation and compensation. For any given block, a search is performed over one or more previously
coded frames to determine the motion vectors that are then used by the encoder and the decoder to
predict the subject block.
4. Exploiting any remaining spatial redundancies that exist within the video frame by coding the
residual blocks, i.e., the difference between the original blocks and the corresponding predicted
blocks, again through transform, quantization and entropy coding.
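The four steps above can be sketched as a toy encoder/decoder pair. This is a minimal illustration, not the actual H.264 algorithm: the block size, search radius, and function names are all hypothetical, and the transform/quantization/entropy stages of steps 2 and 4 are omitted, so the residuals here are kept losslessly.

```python
import numpy as np

BLOCK = 8  # hypothetical block size; H.264 actually uses 16x16 macroblocks with sub-partitions

def motion_search(ref, block, by, bx, radius=4):
    """Step 3: full search in `ref` around (by, bx) for the best match to `block`.
    Returns the (dy, dx) motion vector minimizing the sum of absolute differences."""
    h, w = ref.shape
    best, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = by + dy, bx + dx
            if y < 0 or x < 0 or y + BLOCK > h or x + BLOCK > w:
                continue  # candidate block falls outside the reference frame
            sad = np.abs(ref[y:y+BLOCK, x:x+BLOCK].astype(int) - block.astype(int)).sum()
            if best is None or sad < best:
                best, best_mv = sad, (dy, dx)
    return best_mv

def encode_frame(ref, cur):
    """Steps 1, 3, 4: split `cur` into blocks, predict each from `ref` via a
    motion vector, and keep only the residual (block minus prediction).
    Frame dimensions are assumed to be multiples of BLOCK."""
    mvs, residuals = [], []
    for by in range(0, cur.shape[0], BLOCK):
        for bx in range(0, cur.shape[1], BLOCK):
            block = cur[by:by+BLOCK, bx:bx+BLOCK]
            dy, dx = motion_search(ref, block, by, bx)
            pred = ref[by+dy:by+dy+BLOCK, bx+dx:bx+dx+BLOCK]
            mvs.append((dy, dx))
            residuals.append(block.astype(int) - pred.astype(int))
    return mvs, residuals

def decode_frame(ref, mvs, residuals, shape):
    """The decoder mirrors the prediction: motion-compensate from the same
    reference, then add the residual back."""
    out = np.zeros(shape, dtype=int)
    i = 0
    for by in range(0, shape[0], BLOCK):
        for bx in range(0, shape[1], BLOCK):
            dy, dx = mvs[i]
            out[by:by+BLOCK, bx:bx+BLOCK] = \
                ref[by+dy:by+dy+BLOCK, bx+dx:bx+dx+BLOCK] + residuals[i]
            i += 1
    return out
```

Because the residual is stored exactly, the reconstruction is bit-exact here; in a real codec the residual is transformed and quantized, which is where the actual compression (and loss) happens.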
This is the compression paradigm for H.264, from the attachment above.

Note step 3.
That is: "Exploiting the temporal dependencies that exist between blocks in successive frames, so that only
changes between successive frames need to be encoded."

This assumes that the video plays linearly and uniformly in one direction: forwards.

Not at all good for random frame access, or for seamless looping between any two points in a movie.
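A toy model makes the cost concrete: if every predicted frame depends on the frame before it, then jumping to an arbitrary frame means decoding everything back to the most recent independently coded keyframe (an I-frame). The function name and the keyframe-interval parameter below are hypothetical, and the model ignores B-frames and multi-reference prediction.

```python
def frames_to_decode(target, keyframe_interval):
    """Toy dependency model: an I-frame every `keyframe_interval` frames,
    every other frame predicted from its immediate predecessor. Random
    access to `target` requires decoding the whole chain back to the
    most recent I-frame."""
    last_key = (target // keyframe_interval) * keyframe_interval
    return list(range(last_key, target + 1))
```

This is why streams authored for heavy seeking or looping use short keyframe intervals, trading compression efficiency for cheap random access.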