L3 cache nixed from P4; perhaps my old idea wasn't bad, eh?

MadRat

Lifer
Oct 14, 1999
11,967
280
126
All those so-called experts said that an L3 cache was useless. I didn't believe it, especially after seeing how an FSB-based L3 cache on Socket-7 boards with 2MB of cache improved the K6-3 by up to 20% in processor-heavy benchmarks. The Pentium 4 dropped it to cut costs, and this could be crippling its performance. The Athlon dropped it, too, on the premise that it was unnecessary when the L1 and L2 caches were so big. AMD is reviving it for the server boards and to make SMP possible. VIA/Cyrix dropped it in the C5x family because an L3 cache wasn't in the Socket-370 specs.

I'm assuming that the processors have to have registers to address L3 cache, so the CPU must be designed to support the L3. It is easier to search smaller registers for data, making L3 caches operating at FSB speeds much more efficient than main memory. If not, someone correct my flow of logic here.

I think L3 caches will be 1MB+ by the end of next year. The L1 cache will probably continue to shrink on Intel processors and get bigger on AMD processors. Who knows, maybe we'll even see the advent of an L4 cache dedicated to the SIMD channels before computers get too far along.
 

Sunner

Elite Member
Oct 9, 1999
11,641
0
76
For servers and some workstations, sure, it will be useful.
For home users, it would be a waste of money for little benefit.

Remember when Tom compared a 2 MB Xeon to a regular P2 back when the 450 MHz Xeons were the top of the line?
In just about anything a consumer would ever be interested in, the gains were minimal.

I hardly think consumer CPUs will have L3s in the foreseeable future.
 

Sohcan

Platinum Member
Oct 10, 1999
2,127
0
0


<< All those so-called experts said that an L3 cache was useless >>

No, we said the 128MB L3 SRAM cache you were suggesting was crazy. ;)

The old-style 1-2MB SRAM cache located on the motherboard would probably not be as beneficial as it used to be, since the Athlon and P4 would require very low latency caches...unless the cache can somehow bypass the memory controller, which incurs a 2-4 cycle access hit.

Intel/AMD could put a 1MB SRAM cache near the CPU, but that would require moving back to a cartridge format, which they have no desire to do.

A DRAM L3 cache, even using high-speed DDR SDRAM, would be useless. The higher bandwidth it offers over standard main memory would have no impact, since the cache would still be limited by the 50-70ns access times inherent to all DRAM.
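To put rough numbers on the latency argument, here is a minimal sketch of the usual average-memory-access-time model. Only the 50-70ns DRAM range comes from the post; the hit rates, the on-chip latencies, and the 15ns SRAM figure are made-up assumptions for illustration, and the model assumes lookups happen one level after another (a miss in the L3 pays the L3 latency and then the main-memory latency).

```python
def amat_ns(levels, memory_ns):
    """Average memory access time for a sequential-lookup hierarchy.

    levels: list of (latency_ns, hit_rate), fastest level first.
    memory_ns: main-memory latency paid when every cache level misses.
    """
    total = 0.0
    reach = 1.0  # fraction of accesses that get as far as this level
    for latency_ns, hit_rate in levels:
        total += reach * latency_ns
        reach *= 1.0 - hit_rate
    return total + reach * memory_ns


# Every number below is an assumption for illustration, not a measurement.
on_chip = [(2.0, 0.95), (10.0, 0.80)]  # hypothetical L1 and L2
MAIN_MEMORY_NS = 70.0                  # upper end of the 50-70ns DRAM range

no_l3 = amat_ns(on_chip, MAIN_MEMORY_NS)
dram_l3 = amat_ns(on_chip + [(60.0, 0.50)], MAIN_MEMORY_NS)  # DRAM-speed L3
sram_l3 = amat_ns(on_chip + [(15.0, 0.50)], MAIN_MEMORY_NS)  # SRAM-speed L3

print(f"no L3:   {no_l3:.2f} ns")
print(f"DRAM L3: {dram_l3:.2f} ns")
print(f"SRAM L3: {sram_l3:.2f} ns")
```

With these assumed numbers the DRAM-speed L3 comes out slightly slower than having no L3 at all, while the SRAM-speed L3 helps a little. Different hit rates would shift the figures, but the point stands that an L3 only pays off when its latency is well below main memory's.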

The only option that I see, which Micron is pursuing, is putting embedded DRAM on the northbridge. Since the data doesn't have to travel through the northbridge memory controller, it could cut the access time from 7-8 cycles of the FSB to 3-4 cycles (30-40ns using a 100MHz FSB). This would be relatively cheap, and could provide a high-bandwidth cache with a lower latency than main memory.
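The cycle counts above convert to time as follows. This is just the arithmetic behind the 30-40ns figure, written as a small sketch; the function name is made up for illustration.

```python
def bus_cycles_to_ns(cycles, bus_mhz):
    """Convert a number of bus clock cycles to nanoseconds."""
    return cycles * 1000.0 / bus_mhz


FSB_MHZ = 100  # the 100MHz FSB used in the post

for cycles in (3, 4, 7, 8):
    print(f"{cycles} cycles at {FSB_MHZ}MHz = {bus_cycles_to_ns(cycles, FSB_MHZ):.0f} ns")
```

At 100MHz each cycle is 10ns, so the 7-8 cycle path is 70-80ns and the 3-4 cycle path is 30-40ns.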
 

MadRat

Putting it in the northbridge would be pretty simple too, eh? Almost too simple. *chills*
 

WetSprocket

Senior member
Mar 13, 2000
543
0
0
The Micron Mamba is said to have room for an 8MB L3 cache with virtually no increase in production costs.
 

MadRat

If Micron made chipsets we'd probably see more complexity for somewhat less cost. Look at what they did to SDRAM. :)
 

Usul

Golden Member
Nov 3, 2000
1,016
0
0
OK, just a moment.
What is the point in having an L3 cache as fast as the regular RAM?
I mean, why not cache the RAM at this point?
 

Goi

Diamond Member
Oct 10, 1999
6,770
7
91
We didn't say that an L3 cache would be useless. We said it would be uneconomical and inefficient for the consumer market. It does have a niche in the high-end/workstation/server market. AFAIK, HP and IBM have designs utilizing L3 cache, but they are expensive applications that aren't suitable for the mass market. Besides, to my knowledge even these high-end applications only have something like 8MB of L3 cache, not that humongous 128MB L3 cache that you were thinking of. 128MB of L3 cache using SRAM would be extremely expensive, and using DRAM would be too slow. Unless some other form of memory comes along that is cheap and fast (not only in bandwidth, but more importantly in latency as well), L3 cache will remain out of reach, or unsuitable, for the mass market.
 

Gogga

Member
Feb 6, 2000
177
0
0
I think it should be easier for Intel to implement an L3 cache after they shrink their CPUs down to a 0.13-micron process. The die of the P4 is so big that I'm sure Intel would have had a heck of a problem with heat and space trying to fit an L3 cache in there.

But with the memory bandwidth that Rambus provides, perhaps Intel thought it was not crucial to put this L3 cache in there. A bigger L2 cache would be nice though ;)
 

MadRat

If I could pay an extra $50 to $100 for 1-2MB of super-fast cache then I'd do it. So it only improves a handful of things; it would still be something to brag about. :)
 

Goi

Sure, it'll be something to brag about, but it wouldn't be economical or efficient, and there'd be no market for it, except perhaps for you, hence our argument that your idea isn't that good. Anyway, let's end the argument here. If you really want a high-end system, spend your money on SCSI drives and Alpha systems :)
 

PCResources

Banned
Oct 4, 2000
2,499
0
0


<< if you really want a high end system, spend your money on SCSI drives and Alpha systems >>



And the Alphas will also meet your request for an L3 cache... ;-)
 

MadRat

I don't recall any games offhand that run on the Alpha.

Are there any?
 

jpprod

Platinum Member
Nov 18, 1999
2,373
0
0
<< What is the point in having an L3 cache as fast as the regular RAM? >>

I don't know about the i850, but the i840 has a prefetch cache onboard which helps to hide RDRAM's latency. In a sense it's the third level of caching. Although the i840's prefetch cache is actually slower in bandwidth than the dual-channel memory subsystem, it helps performance because the P3's GTL+ bus couldn't handle the huge 3.2GB/s bandwidth anyway.

I speculate that Mamba's 8MB L3 cache will be a prefetch cache just as on the i840. Effective memory-to-CPU burst bandwidth can never exceed that of the frontside bus (which on future Athlons is 2.1GB/s, DDR 266MHz), but a prefetch cache can keep the real-world bandwidth closer to the burst bandwidth.
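For anyone who wants to check the bandwidth figures, here is a rough sketch of the peak-bandwidth arithmetic. The bus widths and transfer rates are my assumptions, not stated in the post: a 64-bit Athlon FSB at 266 MT/s, two 16-bit PC800 RDRAM channels at 800 MT/s, and a 64-bit P3 GTL+ bus at 133 MT/s. The function names are hypothetical.

```python
def peak_bandwidth_gb_s(bus_width_bits, mega_transfers_per_s, channels=1):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * mega_transfers_per_s * channels / 1000.0


def effective_burst_gb_s(fsb_gb_s, memory_gb_s):
    """The CPU can never pull data faster than the slower of the two links."""
    return min(fsb_gb_s, memory_gb_s)


# Assumed bus parameters (see the note above).
athlon_fsb = peak_bandwidth_gb_s(64, 266)             # DDR 266MHz FSB
rdram_dual = peak_bandwidth_gb_s(16, 800, channels=2)  # dual-channel PC800
p3_gtl = peak_bandwidth_gb_s(64, 133)                  # P3 GTL+ at 133MHz

print(f"Athlon FSB:         {athlon_fsb:.2f} GB/s")
print(f"Dual-channel RDRAM: {rdram_dual:.2f} GB/s")
print(f"P3 GTL+ bus:        {p3_gtl:.2f} GB/s")
print(f"P3 + dual RDRAM:    {effective_burst_gb_s(p3_gtl, rdram_dual):.2f} GB/s")
```

Under those assumptions the P3's bus is the bottleneck by a wide margin, which fits the point that the memory subsystem's 3.2GB/s can't be delivered to the CPU no matter how fast the memory side is.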