The fact that we're seeing three different Navi dies, one of which is monolithic, suggests to me that there are still some kinks to be worked out when it comes to designing MCM GPUs. Otherwise AMD would have just created a single Navi chiplet that could be used in whatever number is necessary to hit a particular performance level.
I don't know. It's certainly possible that Nvidia could split up the product line across multiple nodes, but it does mean a lot of additional design work, since you can't just create a single design for N5 that will magically work if back-ported to N6 without some tweaking.
Those changes may be the reason why the power went through the roof.
Why has nobody mentioned this?
"I must clarify that the current AD102 is NOT the original AD102. Ada Lovelace is no longer a simple Ampere refresh, although it was like this in the beginning."
I like how he's covering for his initial mistake. I suspect all along Ada Lovelace was a new architecture. He may have been fed bad info. Take what he says with a grain of salt.
Hard to say, right? Nvidia called Pascal a new architecture but under the hood, it was more or less an updated Maxwell on a newer node, with higher clocks, better memory compression, and some added features for VR. No IPC gains or a revamp of the SM.
Guess there's a fine line between a "new architecture" and a "simple refresh". If they went to 192 cores/SM plus the extra L2, is that really a new architecture? Maybe if they beefed up the RT engine?
He said that Lovelace is 2.2x GA102 and Navi31 is 2.5x Navi21.
But GA102 is at least 10% faster than Navi21 at 4K.
So
2.2 * 1.1 = 2.42
So
Navi21 = 1x
GA102 = 1.1x
AD102 = 2.42x
Navi31 = 2.5x
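For anyone who wants to play with the numbers, here is a rough sketch of the same arithmetic in Python. The multipliers are just the rumoured and estimated figures quoted above, not measurements, and the names (relative_performance and the constants) are made up for illustration.

```python
# Relative 4K performance, normalised to Navi21 = 1.0.
# All multipliers are the rumoured/estimated figures from the post above,
# not benchmark results.
GA102_VS_NAVI21 = 1.1    # "at least 10% faster" at 4K (the post's estimate)
AD102_VS_GA102 = 2.2     # rumoured
NAVI31_VS_NAVI21 = 2.5   # rumoured


def relative_performance():
    """Chain the multipliers so every chip is expressed relative to Navi21."""
    navi21 = 1.0
    ga102 = navi21 * GA102_VS_NAVI21
    return {
        "Navi21": navi21,
        "GA102": ga102,
        "AD102": ga102 * AD102_VS_GA102,
        "Navi31": navi21 * NAVI31_VS_NAVI21,
    }


for name, score in relative_performance().items():
    print(f"{name} = {score:.2f}x")
```

Change GA102_VS_NAVI21 to whatever gap you think is realistic at 4K and the AD102 vs. Navi31 ordering shifts accordingly.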
Kopite has good sources, but what he says is nonsense. The Lovelace architecture was frozen 10 months ago and nothing has changed since. It's not a trivial task to build these multi-billion-transistor dies, and you can't update a design in the middle of the road unless you want to delay the product by several months. Once it's frozen, you have to do algorithm validation, circuit design, transistor floor planning, AI optimization, and RTL generation / tape-out. AD102 silicon came back 3 weeks ago. After another 4-6 months of debugging and qualification, AD102 HVM should be ready for a back-to-school launch.
No, that is NOT what he means, at all.
It's just wording, but the chronological order he describes suggests that there were vast changes to the architecture.
Those changes may be the reason why the power went through the roof.
Or simply heavily OC'ed silicon, which is why the power went from 450W to 600W, for halo SKU, but standard version.
Is it possible that NV was running a couple of different arch teams against each other, so the leaks are from arch team A, but NV will be going with arch team B's more moonshot design or something?
A company with as much to spend on R&D as NV must have their fingers in a bunch of pies, and we know they design variations on their arch for different markets.
Changing the arch name is like the simplest thing to do.
Or simply heavily OC'ed silicon, which is why the power went from 450W to 600W, for halo SKU, but standard version.
102 TFLOPs works out to roughly 2.75 GHz on 18452 CUDA cores.
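Quick sanity check on that figure, as a sketch. It assumes the usual FP32 convention of 2 FLOPs per CUDA core per clock (one fused multiply-add); that factor is my assumption, not something from the leak, and the function names are just for illustration.

```python
# Assumption: each CUDA core retires one FMA per clock, counted as 2 FLOPs.
FLOPS_PER_CORE_PER_CLOCK = 2


def clock_ghz(tflops, cuda_cores):
    """Clock (GHz) needed to reach a given FP32 TFLOPs figure."""
    flops = tflops * 1e12
    return flops / (FLOPS_PER_CORE_PER_CLOCK * cuda_cores) / 1e9


def peak_tflops(ghz, cuda_cores):
    """Peak FP32 TFLOPs at a given clock (GHz)."""
    return FLOPS_PER_CORE_PER_CLOCK * cuda_cores * ghz * 1e9 / 1e12


# Figures from the post: 102 TFLOPs on 18452 CUDA cores.
print(f"Clock needed for 102 TFLOPs: {clock_ghz(102, 18452):.2f} GHz")
print(f"TFLOPs at 2.75 GHz: {peak_tflops(2.75, 18452):.1f}")
```

Under that assumption the two numbers agree to within about half a percent, so the claim is self-consistent as a rough estimate.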
It could be some cleverly designed beefier cooler. And the huge amount of cache is going to guarantee huge gains.
To add a 50% larger heat sink on top of that?
The top tier may be 40-60% faster than a 3090. That is my guess.