Originally posted by: MadRat
I was speaking more to the idea of the top runners on the market where an extra 64MB does cost around an additional $100.
That extra 64MB costs an extra $100 because the cards themselves cost about $1000. You're comparing workstations to gaming machines. Another 64MB does make a difference when rendering.
Actually, these days, those cards sport 256MB or more, easily.
I'm looking at it from a revenue point of view. Intel is into commoditization of the market because they virtually own the x86 world. If they can design a product that coaxes $10 more profit per unit, then they've just upped the ante. The simplification of the Intel product lines is a direct result of their market share being pretty well maxed out and the market itself tailing off in growth potential, if nothing else for lack of a new gee-whiz product.
Adding more cache doesn't equate to just $10 more. Sure, the sand is cheap, but semiconductor manufacturing may be the only industry where the equipment costs virtually erase material costs. A fab itself costs a few billion to build and may operate at that level for only a few years before requiring an upgrade to a new process. Start calculating how many chips you can build in that time and how much energy and wages you'll have to pay for. Oh, don't forget backups and insurance.
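Just to put rough numbers on that amortization point, here's a quick Python sketch. Every figure in it (fab cost, operating cost, wafer starts per week) is a made-up illustrative assumption, not data from any real fab:

```python
# Back-of-the-envelope fab cost amortization. All numbers are
# illustrative assumptions, not figures from any real fab.

fab_cost            = 2.5e9    # dollars to build the fab (assumed)
operating_cost      = 1.0e9    # energy, wages, maintenance over its life (assumed)
useful_life_years   = 4        # years before the process needs a major upgrade (assumed)
wafers_per_week     = 5_000    # wafer starts (assumed)
good_dies_per_wafer = 60       # ties into the yield example further down

total_cost  = fab_cost + operating_cost
total_chips = wafers_per_week * 52 * useful_life_years * good_dies_per_wafer

# Equipment and operations overhead alone, before a single grain of "cheap sand"
print(f"Overhead per good chip: ${total_cost / total_chips:.2f}")
```

Even with generous assumptions you land at tens of dollars of overhead per chip before materials even enter the picture, which is the point: the sand is the cheap part.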
Yield is the important issue. Increasing die size by a few square millimeters may end up costing you anywhere from 10% to 50% of your optimal yield, depending on where the flaws are likely to show up. This is the main reason nobody makes a wafer-sized CPU; it's simply not practical. Die sizes are usually calculated and/or adjusted to maintain net revenue. Say a wafer holding 100 CPUs gives me 60 good chips, and assume doubling the cache makes each CPU exactly 1.5x as large, so the wafer now fits at most 66 CPUs.
However, wafers are round, so 66 may not actually fit; let's call it 60.
Fewer, bigger CPUs per wafer also means each die is more likely to catch a flaw, so say the yield drops to 20.
These 20 CPUs now cost three times as much per chip as the original 60 with less cache (the wafer costs the same either way), and the extra cache may buy you anywhere from zero to only a few percent more performance.
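Here's that arithmetic as a quick Python sketch, using the illustrative die counts from above (not real fab numbers):

```python
# Rough cost-per-good-die math using the illustrative numbers from the post
# above (not real fab data). Wafer cost is normalized to 1.

wafer_cost = 1.0

# Original design: 100 candidate dies per wafer, 60 of them good.
# Die grown 1.5x to double the cache: fewer candidates fit on a round wafer,
# and each larger die is more likely to catch a defect -- only 20 good.
designs = (
    ("smaller-cache die", 100, 60),
    ("double-cache die",   60, 20),
)

for label, dies, good in designs:
    print(f"{label}: {good}/{dies} good ({good/dies:.0%} yield), "
          f"cost per good die = {wafer_cost/good:.4f} wafers")

# The wafer costs the same either way, so cost scales with 1/yield-per-wafer.
print(f"Cost multiplier: {(wafer_cost/20) / (wafer_cost/60):.1f}x")
```

Which is where the "3x as much" comes from: same wafer, a third as many good chips.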
Only people running time-critical applications, where even the smallest reduction in execution time means something, will buy the CPU that costs ten times as much but gives back only 2% extra performance.
The console isn't everything it's cracked up to be. Besides that, there are relatively "old timer" products on the market that don't run a whole lot faster on the new architecture simply because they are memory-bandwidth constrained. The move to 512K of cache was a nice gesture from Intel/AMD, but it doesn't have the macho appeal of sticking on bigger cache modules like we had in the days of the first Pentiums.
I find it highly unlikely AMD and Intel moved to 512K L2 simply to please customers. Intel probably did it because the PIV seriously needed it. Plus, it's always possible they calculated optimum die space vs yield and found room left over to fill up with extra memory. AMD, on the other hand, probably released 512K as a holdover until Hammer could reach the desktop and also as a way to counter the PIV's bigger number.
The reason sticking extra cache on the first Pentiums improved performance so much comes down to older technology. The Pentium had a front-side bus of 60 or 66MHz, and the back-side bus wasn't any better, if I recall. The main performance advantage, though, likely came from reduced latency versus the main memory's 60ns. Add to that the severe lack of available memory in the first place, which meant frequent disk access, and you can see where most of the gain was coming from.
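A quick average-memory-access-time sketch shows why that mattered so much back then. The latencies and miss rates below are assumed ballpark figures for a 66MHz-bus Pentium, not measured values:

```python
# Average memory access time (AMAT) for a Pentium-class system:
#   AMAT = L2 hit time + L2 miss rate * main-memory penalty
# All latencies and miss rates here are assumptions for illustration only.

main_memory_ns = 60.0   # typical 60 ns FPM/EDO DRAM of the era
l2_hit_ns      = 15.0   # assumed pipeline-burst SRAM access over the 66 MHz bus

def amat(miss_rate: float) -> float:
    """Average access time in ns seen by the CPU for a given L2 miss rate."""
    return l2_hit_ns + miss_rate * main_memory_ns

# Bigger cache -> lower miss rate (the rates themselves are assumed).
for miss_rate in (0.40, 0.20, 0.10):
    print(f"L2 miss rate {miss_rate:.0%}: AMAT = {amat(miss_rate):.1f} ns")
```

With main memory that slow relative to the bus, every miss you shave off shows up directly, which is why bolting on cache modules paid off so visibly back then.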
Nowadays, memory isn't as constrained. Just look around and you'll see plenty of debates over whether 512MB is better than 1GB, which usually end with a grudging "In most cases, 512 is enough, but we're seeing some programs take up more." Compilers are also a lot better these days, as is data-structure design, each doing its part to reduce disk access.
Truth be told, we're probably already beyond the point where the choice between serial and parallel memory interfaces plays a large role in performance. RDRAM runs fast enough to compete with DDR and vice versa. By the time you figure out which will end up faster, something better will be available and the point will be moot.
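For what it's worth, the on-paper peak numbers land close enough together that the rest comes down to latency and price. A quick sketch of theoretical peak bandwidth (real-world throughput differs):

```python
# Peak theoretical bandwidth = bus width (bytes) * transfer rate.
# These are the published interface specs; sustained throughput is lower.

def peak_gb_s(bus_bits: int, transfers_per_sec: float) -> float:
    return bus_bits / 8 * transfers_per_sec / 1e9

configs = {
    "PC800 RDRAM, single channel": (16, 800e6),   # 16-bit channel at 800 MT/s
    "PC800 RDRAM, dual channel":   (32, 800e6),   # e.g. two channels ganged
    "DDR266 (PC2100)":             (64, 266e6),   # 64-bit bus at 266 MT/s
    "DDR333 (PC2700)":             (64, 333e6),
}

for name, (bits, rate) in configs.items():
    print(f"{name:30s} {peak_gb_s(bits, rate):.1f} GB/s")
```

Roughly 1.6-3.2 GB/s either way, which is why the "which is faster" debate never settles before the next generation makes it irrelevant.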
If you really believe cache makes all the difference in the world, I suggest you try using a Xeon with 2MB and compare it to a Xeon with 512K. See what kind of performance gains you get, then take a good look at the extra $3000 the 2MB version costs.
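If you want to put a number on that, divide the price delta by whatever gain you actually measure. The gain percentages below are purely hypothetical placeholders; only the $3000 figure comes from the comparison above:

```python
# Dollars per percent of performance gained by the bigger-cache Xeon.
# price_delta is from the post; the gain percentages are hypothetical.

price_delta = 3000.0   # extra cost of the 2MB part over the 512K part

for gain_pct in (2, 5, 10):   # hypothetical benchmark gains
    print(f"{gain_pct:2d}% faster -> ${price_delta / gain_pct:,.0f} per percent gained")
```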
If adding such "monstrous" cache sizes is so cheap and easy and the performance gain is so noticeable, why don't any desktop CPUs have them? If the cost were only 10% for a large performance gain, don't you think a lot of people would pay that extra 10%? At that point, you've got a Willamette vs a Northwood.
The thing is, the extra cost is already included in CPU prices, and it isn't always 10% or less.