On most systems, memory is structured in a hierarchy that contains a small amount of faster, more expensive memory at the top (e.g., CPU registers) and a large amount of slower memory at the base (e.g., hard disks). As memory is referenced, it is automatically copied into higher levels of the hierarchy, so data that is referenced most often migrates to the fastest memory locations.
The goal of machine designers and programmers is to maximize the chance of finding data as high up in this memory hierarchy as possible. To achieve this goal, algorithms for maintaining the hierarchy, embodied in the hardware and the operating system, assume that programs have locality of reference in both time and space; that is, programs are much more likely to access a location recently accessed or those nearby it, than elsewhere. Performance increases if you respect the degree of locality required by each level in the memory hierarchy.