There are large books written just on cache architecture and theory, but basically, the "surface scratching" goes like this:
Cache is physically a bank of RAM. It's usually located on the die or, in the case of the Slot X CPUs, on a separate chip with a dedicated bus to the CPU.
The purpose of the cache is to keep the most frequently used data close to whatever needs it. That can be a bank of memory near the CPU that's faster than the "regular" RAM (as with an instruction cache), an area of regular RAM that holds frequently accessed data, or an area of RAM that holds an index to frequently accessed disk-based data. Having a "holding tank" of information speeds up access by cutting the time spent figuring out where the data is and going to get it.
When a process asks memory (or disk) for data, it gets the data, and a copy of it (or a pointer to where it lives) is also stored in the cache. When the cache is full, the management logic replaces some of the old cache data with the newer data. How that decision is made depends on the management logic. Some systems used to replace the oldest data with the newest data, like a circular buffer. Newer/better management logic replaces the least accessed data (least recently or least frequently used) with the newer information. Some caches won't cache a single access at all: the data must be requested X number of times before it replaces some other value in the cache.
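To make the replacement idea concrete, here's a rough Python sketch of a tiny cache with two policies: "fifo" (replace the oldest entry, circular-buffer style) and "lru" (replace the least recently used entry, which I'm using as a simple stand-in for "least accessed"). The names TinyCache, run and fetch are made up for this example, and real hardware does this in logic rather than software, so take it as an illustration only.

```python
from collections import OrderedDict


class TinyCache:
    """Toy fixed-size cache. policy is 'fifo' or 'lru' (illustrative names)."""

    def __init__(self, capacity, policy="lru"):
        self.capacity = capacity
        self.policy = policy
        self.data = OrderedDict()  # front of the dict = next entry to evict
        self.hits = 0
        self.misses = 0

    def read(self, key, fetch):
        # Check the cache first.
        if key in self.data:
            self.hits += 1
            if self.policy == "lru":
                # A hit makes this entry the most recently used one.
                self.data.move_to_end(key)
            return self.data[key]

        # Miss: do the slow fetch, then keep a copy in the cache.
        self.misses += 1
        value = fetch(key)
        if len(self.data) >= self.capacity:
            self.data.popitem(last=False)  # evict the entry at the front
        self.data[key] = value
        return value


def run(policy, accesses, capacity=2):
    """Replay a list of keys and return (hits, misses)."""
    cache = TinyCache(capacity, policy)
    for key in accesses:
        cache.read(key, fetch=lambda k: k * 10)  # stand-in for a slow fetch
    return cache.hits, cache.misses


if __name__ == "__main__":
    pattern = [1, 2, 1, 3, 1, 2]
    print("fifo:", run("fifo", pattern))
    print("lru: ", run("lru", pattern))
```

With that access pattern and room for two entries, the FIFO version throws out key 1 even though it keeps getting used, so it ends up with fewer hits than the LRU version.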
It's possible to have too much cache. When a process calls for data, the cache is checked first to see if the information is available. If it is, it's retrieved from there, which is faster than a standard access. If it's not, the usual fetch routines kick in and get the data (and probably cache it). A bigger cache generally takes longer to search, so if you have an extremely large cache and the data is not in there, the time spent searching it is essentially wasted. That's the "art" of cache design: make it large enough to serve some high percentage of accesses, but small enough that cache "misses" don't delay a standard fetch any more than necessary. The management logic and the set associativity design are also a big part of the efficiency.
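The size trade-off can be put into rough numbers with the usual back-of-the-envelope formula: every access pays the cache lookup, and only the misses pay the full fetch. The function name and all the numbers below are invented for illustration (they're not measurements of any real CPU); the point is just that a bigger, slower-to-search cache can lose even with a better hit rate.

```python
def average_access_time(hit_time, miss_penalty, hit_rate):
    """Average time per access.

    hit_time:     time to search the cache (paid on every access)
    miss_penalty: extra time for a standard fetch (paid only on a miss)
    hit_rate:     fraction of accesses found in the cache, 0.0 to 1.0
    """
    return hit_time + (1.0 - hit_rate) * miss_penalty


if __name__ == "__main__":
    # Hypothetical small cache: quick to search, 90% hit rate.
    small = average_access_time(hit_time=1, miss_penalty=100, hit_rate=0.90)
    # Hypothetical huge cache: slower to search, only slightly better hit rate.
    huge = average_access_time(hit_time=4, miss_penalty=100, hit_rate=0.92)
    print("small cache:", round(small, 2))
    print("huge cache: ", round(huge, 2))
```

With these made-up figures the small cache averages about 11 time units per access and the huge one about 12, so the extra size cost more in lookup time than it gained in hits.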
So there ya go. This is a very general overview and there are a boatload of details missing, but hopefully this will give you the general idea.
FWIW
Scott