Back in the day, L2 served as the LLC; most of the time (except on some special CPUs) there was no L3 to fall back to. And it was very fast too. On Penryn it was ~15 cycles, incredibly tight for a 6 MB cache that was also shared between two cores. Contemporary Atoms have 4 MB of shared L2 at ~20 cycles, and they don't clock much higher. That's how tightly Intel had to design the memory subsystem when their FSB architecture from 1985 was getting hammered by AMD.
Nehalem's L2 was just mediocre at ~12 cycles and a tiny 256 KB. The saving grace of that design was the inclusive L3 and the IMC, which finally reduced overall memory latency while enabling very fast inter-core communication in the emerging multithreaded world. The pinnacle of this architecture was Ivy Bridge, which had the same 8 MB of L3 and a tight IMC that could run DDR3 at around 2400, giving memory latencies that later generations couldn't touch, because their decoupled uncore added latency for no good reason (on desktop).
I think the era of anemic private per-core caches is finally over. Everyone will have 1-2 MB of L2: ARM, AMD, Intel, all of them.