DaddyG - not quite true (but the right idea). The L1 is NOT always split into separate caches. Most of the time, the L1 IS split into a data cache (Dcache) and an instruction cache (Icache). Traditionally, if the two L1 caches differed in size, the instruction cache was the smaller one (data requires FAR more room).
There is an alternative to this "split" architecture (referred to as "Harvard architecture"): unified. The Cyrix 686, 686mx, M2, and the Joshua version of the Cyrix III all used a single, unified L1 cache. 486s used a unified L1 cache too.
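To make the split vs. unified distinction concrete, here is a rough toy sketch in Python. Everything in it is made up for illustration: the class names (`DirectMappedCache`, `SplitL1`, `UnifiedL1`), the direct-mapped organization, and the 32-byte line size are assumptions, and it does not model any of the real chips mentioned above. It only shows where an access gets routed in each design.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks tags only (no data payload)."""

    def __init__(self, num_lines, line_size=32):
        self.num_lines = num_lines
        self.line_size = line_size          # bytes per line (assumed value)
        self.tags = [None] * num_lines      # one tag slot per cache line
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        block = addr // self.line_size      # which memory block holds addr
        index = block % self.num_lines      # which cache line it maps to
        tag = block // self.num_lines       # identifies the block in that line
        if self.tags[index] == tag:
            self.hits += 1
            return True
        self.tags[index] = tag              # miss: replace whatever was there
        self.misses += 1
        return False


class SplitL1:
    """Split L1: instruction fetches and data accesses use separate caches."""

    def __init__(self, icache_lines, dcache_lines):
        self.icache = DirectMappedCache(icache_lines)
        self.dcache = DirectMappedCache(dcache_lines)

    def access(self, addr, is_instruction):
        cache = self.icache if is_instruction else self.dcache
        return cache.access(addr)


class UnifiedL1:
    """Unified L1: instructions and data share a single cache."""

    def __init__(self, num_lines):
        self.cache = DirectMappedCache(num_lines)

    def access(self, addr, is_instruction):
        # The access type is ignored; everything lands in the same cache.
        return self.cache.access(addr)
```

In the split model, an instruction fetch can never evict a data line (or vice versa); in the unified model, both kinds of access compete for the same lines. Which one performs better on a real chip depends on sizes, associativity, and workload, none of which this toy tries to capture.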